I’m trying to create multiple two may summary tables from a list of variables. Ideally I’d then like to append those tables.
My dataframe looks like this with multiple demographic variables, one ID variable, and one binary variable identifying whether the year was a targeted intervention year:
ID | Gender | Race | AgeCategory | TargetYear |
---|---|---|---|---|
1234 | Male | White | 18-25 | 1 |
5675 | Female | Asian | 30-35 | 0 |
I’m trying to create crosstab summaries for every combination of TargetYear and the demographic variables (TargetYear + Gender, TargetYear + Race, etc.) I know I can likely do this in a loop or function but haven’t had success so far.
I initially tried to approach this with a combination of a for loop and dplyr summary after defining the variables:
library(dplyr)
library(tidyr)
variables <- c("Gender", "Race", "AgeCategory")
agg <- NULL
for (i in variables){
u <- df %>% group_by(TargetYear, (i)) %>% summarize(n=n_distinct(ID)) %>% ungroup() %>% mutate(total=sum(n)) %>% group_by(TargetYear, (i)) %>% pivot_longer((i))
return(u)
bind_rows(agg, u)
}
Where I was expecting:
TargetYear | n | total | name | value |
---|---|---|---|---|
1 | 5 | 30 | Gender | Male |
0 | 10 | 30 | Gender | Male |
1 | 3 | 30 | Gender | Female |
0 | 12 | 30 | Gender | Female |
1 | 10 | 50 | Race | White |
0 | 20 | 50 | Race | White |
1 | 23 | 50 | Race | Asian |
0 | 27 | 50 | Race | Asian |
And instead just got a NULL agg and a u with this:
TargetYear | (i) | n | total |
---|---|---|---|
0 | Gender | 15 | 30 |
1 | Gender | 15 | 30 |
I am very new to loops and functions so I’m not sure where I’m going wrong here. Any help would be appreciated! Especially solutions utilizing loops and functions.
AlexC is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.