I’m trying to create a table to describe data which has multiple dichotomous columns describing whether a particular event happened or not.
Single IDs can have multiple events, and there are multiple sub-types of events (described by inf_<n> and sub_inf_<n>_t<n>).
This is the code I have so far.
set.seed(123)
library(tidyverse)
library(gtsummary)
library(kableExtra)
df <- tidyr::tibble(
  id = c(1,2,3,4,5,6,7,8,9,10),
  inf_1 = structure(factor(sample(seq(0,1), size = 10, replace = TRUE), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "IA"),
  inf_2 = structure(factor(rep_len(0, length.out = 10), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "PNEU"),
  inf_3 = structure(factor(sample(seq(0,1), size = 10, replace = TRUE), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "BSI"),
  inf_4 = structure(factor(sample(seq(0,1), size = 10, replace = TRUE), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "other"),
  sub_inf_1_t1 = structure(factor(sample(seq(0,1), size = 10, replace = TRUE), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "lower"),
  sub_inf_1_t2 = structure(factor(sample(seq(0,1), size = 10, replace = TRUE), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "upper"),
  sub_inf_1_t3 = structure(factor(c(0,0,0,1,1,0,NA,NA,0,1), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "other"),
)
df %>%
  mutate(across(where(is.factor), ~ forcats::fct_recode(.x, "No" = "Unchecked", "Yes" = "Checked"))) %>%
  gtsummary::tbl_summary(data = ., include = -id, statistic = list(all_categorical() ~ "{n} / {N} ({p}%)")) %>%
  kable()
Characteristic | N = 10 |
---|---|
IA | 5 / 10 (50%) |
PNEU | 0 / 10 (0%) |
BSI | 2 / 10 (20%) |
other | 3 / 10 (30%) |
lower | 5 / 10 (50%) |
upper | 8 / 10 (80%) |
other | 3 / 8 (38%) |
Unknown | 2 |
Created on 2024-12-03 with reprex v2.1.1
What I would like instead is to have the main event types ordered by frequency, with other at the end of the list, and the subtypes shown directly under their main event type and indented, like the following:
Characteristic | N = 10 |
---|---|
IA | 5 / 10 (50%) |
  lower | 5 / 10 (50%) |
  upper | 8 / 10 (80%) |
  other | 3 / 8 (38%) |
    Unknown | 2 |
BSI | 2 / 10 (20%) |
PNEU | 0 / 10 (0%) |
other | 3 / 10 (30%) |
Any way to do this?
Thank you.
Two options:
(1) Using modify_column_indent
df %>%
  mutate(across(where(is.factor), ~ forcats::fct_recode(.x, "No" = "Unchecked", "Yes" = "Checked"))) %>%
  tbl_summary(include = -id,
              statistic = list(all_categorical() ~ "{n} / {N} ({p}%)")) %>%
  modify_column_indent(
    columns = label,
    rows = variable %in% c("sub_inf_1_t1", "sub_inf_1_t2", "sub_inf_1_t3"))
(2) Using bstfun::add_variable_grouping
devtools::install_github("MSKCC-Epi-Bio/bstfun")
library(bstfun)
df %>%
  mutate(across(where(is.factor), ~ forcats::fct_recode(.x, "No" = "Unchecked", "Yes" = "Checked"))) %>%
  tbl_summary(include = c(inf_1, starts_with("sub_inf_1"), starts_with("inf_")),
              statistic = list(all_categorical() ~ "{n} / {N} ({p}%)")) %>%
  add_variable_grouping(
    "Subtype" = c("sub_inf_1_t1", "sub_inf_1_t2", "sub_inf_1_t3")
  )
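Neither option reorders the main event types by frequency. Here is a minimal sketch of one way to get that ordering, assuming the main columns match inf_<n> and each subtype column is named sub_<main>_t<n>, as in the question. It counts the "Yes" values per main column, sorts them in descending order with any column labelled "other" placed last, puts each main column's subtypes right after it, and passes the result to include. The toy data frame df_demo, the helper yn(), and the names main_order and include_vars are made up for illustration. The tbl_summary() call is guarded so the ordering logic still runs where gtsummary is not installed.

```r
# Hypothetical toy data in the same shape as the question's df (values are illustrative)
yn <- function(x) factor(x, levels = c(0, 1), labels = c("No", "Yes"))
df_demo <- data.frame(
  id           = 1:6,
  inf_1        = yn(c(1, 1, 1, 0, 0, 0)),
  inf_2        = yn(c(0, 0, 0, 0, 0, 0)),
  inf_3        = yn(c(1, 0, 0, 0, 0, 0)),
  inf_4        = yn(c(1, 1, 1, 1, 0, 0)),
  sub_inf_1_t1 = yn(c(1, 0, 1, 0, 0, 0)),
  sub_inf_1_t2 = yn(c(0, 1, 1, 0, 0, 0))
)
var_labels <- c(inf_1 = "IA", inf_2 = "PNEU", inf_3 = "BSI", inf_4 = "other",
                sub_inf_1_t1 = "lower", sub_inf_1_t2 = "upper")
for (v in names(var_labels)) attr(df_demo[[v]], "label") <- var_labels[[v]]

# Main event columns: inf_1, inf_2, ... (assumed naming convention)
main_vars <- grep("^inf_[0-9]+$", names(df_demo), value = TRUE)

# Number of "Yes" per main column, and whether its label is "other"
n_yes    <- vapply(df_demo[main_vars], function(x) sum(x == "Yes", na.rm = TRUE), numeric(1))
is_other <- vapply(df_demo[main_vars], function(x) tolower(attr(x, "label")) == "other", logical(1))

# Non-"other" columns first, then by descending frequency
main_order <- main_vars[order(is_other, -n_yes)]

# Slot each main column's subtypes (sub_<main>_t<n>) directly after it
include_vars <- unlist(lapply(main_order, function(v) {
  c(v, grep(paste0("^sub_", v, "_"), names(df_demo), value = TRUE))
}), use.names = FALSE)

if (requireNamespace("gtsummary", quietly = TRUE)) {
  tbl <- gtsummary::tbl_summary(
    df_demo,
    include   = dplyr::all_of(include_vars),
    statistic = list(gtsummary::all_categorical() ~ "{n} / {N} ({p}%)")
  )
  print(as.data.frame(tbl))
}
```

The resulting table can then be piped into the indenting step from option (1) or the grouping step from option (2). Ties in frequency keep their original column order because order() is stable.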
Data:
Always set the seed when simulating data.
set.seed(123)
df <- tibble(
  id = 1:10,
  inf_1 = structure(factor(sample(seq(0,1), size = 10, replace = TRUE), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "IA"),
  inf_2 = structure(factor(rep_len(0, length.out = 10), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "PNEU"),
  inf_3 = structure(factor(sample(seq(0,1), size = 10, replace = TRUE), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "BSI"),
  inf_4 = structure(factor(sample(seq(0,1), size = 10, replace = TRUE), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "Other"),
  sub_inf_1_t1 = structure(factor(c(0,1,0,1,0,0,0,0,0,0), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "lower"),
  sub_inf_1_t2 = structure(factor(c(0,1,0,1,1,0,0,0,0,0), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "upper"),
  sub_inf_1_t3 = structure(factor(c(rep(0,9),NA), levels = c(0,1), labels = c("Unchecked", "Checked")), label = "other"),
)