I have a large dataset and I need to count the total occurrences of “no” across multiple columns.
The dataset looks like this:
id = c(1,2,3,4,5,6,7,8)
trat = c("a","b","a","b","a","b","a","b")
var1 = c("no","no","no","no","yes",NA,NA,"no")
var2 = c("yes","yes","no","no","yes","no",NA,"no")
df = data.frame(id,trat,var1,var2)
Resulting the following table:
id trat var1 var2
1 1 a no yes
2 2 b no yes
3 3 a no no
4 4 b no no
5 5 a yes yes
6 6 b <NA> no
7 7 a <NA> <NA>
8 8 b no no
The result I want to achieve is the following:
id trat var1 var2 count_no
1 1 a no yes 1
2 2 b no yes 1
3 3 a no no 2
4 4 b no no 2
5 5 a yes yes 0
6 6 b <NA> no 1
7 7 a <NA> <NA> 0
8 8 b no no 2
How can I do that?
Thanks!
0
An approach using across
and rowSums
library(dplyr)
df %>%
mutate(count_no = rowSums(across(var1:var2, ~ .x == "no"), na.rm=T))
id trat var1 var2 count_no
1 1 a no yes 1
2 2 b no yes 1
3 3 a no no 2
4 4 b no no 2
5 5 a yes yes 0
6 6 b <NA> no 1
7 7 a <NA> <NA> 0
8 8 b no no 2
This is not a tidyverse solution, but:
df$count_no <- apply(df[,c("var1", "var2")], 1,
function(x) sum(x=="no", na.rm= TRUE))
seems to work fine … in tidyverse, try
df |>
dplyr::rowwise() |>
dplyr::mutate(count_no = sum(c(var1,var2)=="no", na.rm = TRUE))