I am wanting to make a bar plot, with error bars representing standard deviations and an overlay of a scatter plot to show individual data points. The data is microbial count data and there are 2 conditions, for which there are three time points. For 5 out of the 6 conditions, some of the individual data points are zero, which I inputted as 1 in order to have a data point showing (in the grand scheme of things, the 0 to 1 transformation does not change much given how large the other numbers are). However, when plotted, the means for 3 of the bars (the “Copper” bars are not correct). I assume it’s due to the transformation of the y-axis in log10 scale but I can’t figure out how to fix it!
I am quite new to R and even more to ggplot2 so I’m at a loss here. Any help would be greatly appreciated.
Below is the code I have tried, with a subset of the data.
Data:
Material<-c("Copper T0", "Copper T0","Copper T0","Copper T0","Copper T0","Copper T0","Copper T0","Copper T0","Copper T0",
"Copper T+1h","Copper T+1h","Copper T+1h","Copper T+1h","Copper T+1h","Copper T+1h","Copper T+1h","Copper T+1h","Copper T+1h",
"Copper T+2h","Copper T+2h","Copper T+2h","Copper T+2h","Copper T+2h","Copper T+2h","Copper T+2h","Copper T+2h","Copper T+2h",
"Steel T0", "Steel T0","Steel T0","Steel T0","Steel T0","Steel T0","Steel T0","Steel T0","Steel T0",
"Steel T+1h", "Steel T+1h","Steel T+1h","Steel T+1h","Steel T+1h","Steel T+1h","Steel T+1h","Steel T+1h","Steel T+1h",
"Steel T+2h","Steel T+2h","Steel T+2h","Steel T+2h","Steel T+2h","Steel T+2h","Steel T+2h","Steel T+2h","Steel T+2h")
CFU<-c(1,1,1,1,1,1,1,1,12500,
1,1,1,1,1,1,1,1,12500,
1,1,1,1,1,1,1,1,12500,
262500,137500,437500,112500,37500,62500,250000,225000,50000,
50000,112500,37500,62500,75000,1,12500,87500,1,
1,1,25000,12500,1,1,137500,150000,112500)
data3<-data.frame(Material,CFU)
print(data3)
Code:
library(readxl)
library(ggplot2)
library(Hmisc)
library(scales)
order<-c("Steel T0", "Steel T+1h", "Steel T+2h", "Copper T0", "Copper T+1h", "Copper T+2h")
truncate_error_bars <- function(x) {
data <- mean_sdl(x, mult = 1)
data$ymin <- pmax(data$ymin, 0)
return(data)
}
ggplot(data3, aes(x=factor(Material, order), y=`CFU`, fill=factor(Material))) +
stat_summary(fun=mean, geom="bar", show.legend=FALSE, color="black") +
stat_summary(fun.data=truncate_error_bars,
geom="errorbar", width=.2) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(),axis.line=element_line(colour = "darkgrey"))+
scale_y_continuous(trans= 'log10', expand = c(0, 0.0001),
breaks = trans_breaks("log10",n=8, function(x) 10^x))+
geom_point(position = position_jitter(width = 0.2), color = "black", show.legend=FALSE)+
expand_limits(x = 0, y = c(0,1e+07))+
theme(axis.title.x=element_blank(), axis.ticks.x=element_blank())+
scale_fill_manual(values=c("sienna3",
"sienna3",
"sienna3",
"lightslategrey",
"lightslategrey",
"lightslategrey")) +
ylab(bquote(PFU.cm^-2))
The graph I got from this is the following:
Bar chart
The means that are showing on the graph are not the correct ones for the data. I would like the bars to reach the correct means if possible.
Thank you so much for you help in advance!
Emma Chareyre is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.