I have a sample table like this:
<code>dat <- data.frame(
  colA = c("factor1", "factor1", "factor1", "factor1", "factor2", "factor2", "factor2", "factor2"),
  colB = c("a", "a", "a", NA, "a", "a", "a", NA),
  colC = c(1, 1, 1, 5, 2, 4, 5, 6))
dat
     colA colB colC
1 factor1    a    1
2 factor1    a    1
3 factor1    a    1
4 factor1 <NA>    5
5 factor2    a    2
6 factor2    a    4
7 factor2    a    5
8 factor2 <NA>    6
</code>
I want to sum the values in colC, grouped (group_by) by colA, keeping the rows where colB is NA separate from the rows where it is not. After that, I'd like to calculate the frequency of the resulting sums, i.e. each sum's share of the total for its colA group.
My desired outputs are:
colA | colB | colC |
---|---|---|
factor1 | a | 3 |
factor1 | NA | 5 |
factor2 | a | 11 |
factor2 | NA | 6 |

colA | colB | Freq |
---|---|---|
factor1 | a | 0.375 |
factor1 | NA | 0.625 |
factor2 | a | 0.65 |
factor2 | NA | 0.35 |
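To make the target concrete, here is a minimal base R sketch of the computation the two tables describe. It assumes that rows with NA in colB form their own group (as the desired output suggests) rather than being dropped; the "NA" string label and the names `sums` and `Freq` are illustrative choices, not part of the original data.

```r
dat <- data.frame(
  colA = rep(c("factor1", "factor2"), each = 4),
  colB = c("a", "a", "a", NA, "a", "a", "a", NA),
  colC = c(1, 1, 1, 5, 2, 4, 5, 6)
)

# Assumption: NA in colB is kept as its own group. aggregate() would
# otherwise drop those rows, so relabel NA with the string "NA" first.
dat$colB <- ifelse(is.na(dat$colB), "NA", as.character(dat$colB))

# Sum colC within each colA/colB combination.
sums <- aggregate(colC ~ colA + colB, data = dat, FUN = sum)
sums <- sums[order(sums$colA), ]

# Share of each sum within its colA group.
sums$Freq <- sums$colC / ave(sums$colC, sums$colA, FUN = sum)
sums
```

The shares for factor2 come out as 11/17 and 6/17, which round to the 0.65 and 0.35 shown in the table above.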
Any suggestions for this? Thanks in advance!