Here is an example of a dataframe:
df <- tibble(col_1 = c(0, 1, 1, 0, 1))
Basically, I want to divide the number of 1s (3) by the total number of 0s and 1s (5). How would I go about doing this for one column? Also, is there a way to do this for multiple columns that are structured the same way? Thank you.
I tried using different R functions, but alas, I can’t figure out what to do.
You are looking for mean(): for a column that contains only 0s and 1s, the mean is the proportion of 1s.
df <- tibble(col_1 = c(0, 1, 1, 0, 1))
mean(df$col_1)
[1] 0.6
For multiple columns, try:
library(dplyr)
df %>%
  reframe(across(everything(), mean))
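If you would rather avoid dplyr, the same idea can be sketched in base R with colMeans(). The second column and its NA are made up for illustration and are not part of the question's data; na.rm = TRUE is only needed if your columns can contain missing values.

```r
# Toy data mirroring the question; col_2 and its NA are invented for illustration
df <- data.frame(col_1 = c(0, 1, 1, 0, 1),
                 col_2 = c(1, NA, 0, 0, 1))

# The mean of a 0/1 vector equals the share of 1s
mean(df$col_1)

# One proportion per column; na.rm = TRUE drops missing values before averaging
props <- colMeans(df, na.rm = TRUE)
props
```

Note that this only gives the proportion of 1s when every non-missing value is 0 or 1.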
Use sum()/colSums() with a condition (x == 1L / x %in% 0:1). Depending on your data structure and the desired result, you might need to specify na.rm=TRUE in sum/colSums.
For df0, you can do
df0 = data.frame(col_1 = c(0, 1, 1, 0, 1))
MASS::fractions(sum(df0$col_1 == 1L) / sum(df0$col_1 %in% 0L:1L))
giving
[1] 3/5
and for df1
df1 = data.frame(col_1 = c(0, 1, 1, 0, 1), col_2 = c(5, 1, 1, 0, 1))
MASS::fractions((i<-colSums(df1==1L)) / (colSums(df1==0L)+i))
we obtain
col_1 col_2
  3/5   3/4
Assumption: There can be values other than 0 and 1.
MASS::fractions() is probably not really needed.
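As a rough sketch of the same counting logic without MASS::fractions(), you could wrap it in a small helper. The name prop_ones is hypothetical (it is not from any package), and the sketch assumes that NAs and any values other than 0 and 1 should simply be ignored.

```r
# Hypothetical helper: share of 1s among the entries that are 0 or 1
prop_ones <- function(x) {
  is_binary <- x %in% c(0, 1)          # FALSE for NA and for any other value
  sum(x[is_binary] == 1) / sum(is_binary)
}

df1 <- data.frame(col_1 = c(0, 1, 1, 0, 1), col_2 = c(5, 1, 1, 0, 1))

# Apply the helper to every column; vapply() guarantees one number per column
res <- vapply(df1, prop_ones, numeric(1))
res
```

A column with no 0s or 1s at all yields NaN here (0 divided by 0), so you may want to handle that case explicitly.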
Base R solution with table/proportions, which are meant for this type of problem.
df <- tibble::tibble(col_1 = c(0, 1, 1, 0, 1))
proportions(table(df$col_1))
#>
#>   0   1
#> 0.4 0.6
df2 <- tibble::tibble(col_1 = c(0, 1, 1, 0, 1), col_2 = c(0, 1, 1, 0, 0))
sapply(df2, \(x) proportions(table(x)))
#>   col_1 col_2
#> 0   0.4   0.6
#> 1   0.6   0.4
Created on 2024-09-24 with reprex v2.1.0
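One caveat with the sapply() approach: if a column happens to contain only 0s or only 1s, its table has a single entry and the result no longer simplifies to a matrix. A possible workaround, sketched here on made-up data (df3 is not from the question), is to fix the levels with factor() so every column reports both values. proportions() needs R 4.0.1 or later; prop.table() does the same on older versions.

```r
# Invented example: col_2 contains only 1s
df3 <- data.frame(col_1 = c(0, 1, 1, 0, 1), col_2 = c(1, 1, 1, 1, 1))

# Fixing the levels makes table() count both 0 and 1 even when one is absent
tab <- sapply(df3, function(x) proportions(table(factor(x, levels = c(0, 1)))))
tab
```

The result is a matrix with one row per value ("0" and "1") and one column per data frame column, so the proportion of 1s is the row tab["1", ].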