I hope someone can help me with this problem. I have no background in computer science or programing, which makes it hard to understand why my code doesn’t work.
My goal is to create a number of plots, like the one below. To test for differences between means I’am using a mixed linear model (lme4::lmer + lmerTest for p-values) together with pairwise comparisons using the emmeans-package.
enter image description here
In another post on stack overflow (answered by learners) I found a method to extract the contrasts-output and how to add them to a boxplot using the ggpubr-package. I works fine to do a plot with significance levels manually. However, the code is quite verbose so I would prefer to create a function in order to reduce the risk for errors when creating multiple plots using different variables and situations/conditions.
Below is an erroneous code, but it illustrates what I try to accomplish. I have understood, that it can be complicated to introduce a formula inside a function, and the formula arguments may needs to be transferred to a character and then reused with as.formula
, but I have not been able to do this. Creating the plot with a function works fine, but not formulating the mixed model and extracting/adding the significance levels.
library(tidyverse)
library(lmerTest)
library(lme4)
library(reshape) # using melt()
library(ggpubr) # using stat_pvalue_manual()
plot_signf <- function(df, condition, var){
model <- df %>%
filter(prov == {{condition}}) %>%
lmer(log(var) ~ part + (1|subject), data = .)
summary(model)
plot(model)
signif <- emmeans(model, specs = pairwise ~ part, type = "response")
#Extracting emmeans-contrats output
p.val.test <- pwpm(signif, means = FALSE, flip = TRUE,reverse = TRUE)
p.val.test <- sub("[<>]", "", p.val.test)
p.matx <- matrix(as.numeric((p.val.test)),nrow = length(p.val.test[,1]),ncol = length(p.val.test[,1]))
rownames(p.matx) <- colnames(p.matx) <- colnames(p.val.test)
p.matx[upper.tri(p.matx, diag=FALSE)] <- NA
stat.test <- subset(melt(p.matx),!is.na(value))
#ggpubr::stat_pvalue_manual() needs data organized in group 1-2 + p-value.
names(stat.test) <- c("group1","group2","p.adj")
stat.test[stat.test$p.adj<=0.001,"p.adj.signif"] <- "***"
stat.test[stat.test$p.adj>0.001 & stat.test$p.adj<=0.01,"p.adj.signif"] <- "**"
stat.test[stat.test$p.adj>0.01 & stat.test$p.adj<=0.05,"p.adj.signif"] <- "*"
stat.test[ stat.test$p.adj>0.05,"p.adj.signif"] <- "ns"
df %>%
filter(prov == {{condition}}) %>%
ggplot(aes(x = part, y = {{var}}))+
geom_boxplot(staplewidth = 0.3, outliers = FALSE)+
geom_jitter(aes(color = part), alpha = 0.6, width = 0.2)+
stat_pvalue_manual(stat.test,
y.position = max(df$var[df$prov == {condition}], na.rm=TRUE),
step.increase = 0.1)+
theme_pubr()+
theme(legend.position = "element_blank")
}
plot_signf(df = mean, condition = "snp", var = IR_AUC)
Any help appreciated. Thanks in advance.
e.naslund is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.