It’s clear I’m not understanding the math behind what I wish to accomplish. I want a stacked bar chart showing the percent contribution of each contamination source to the total ASV counts per sample. My example data trytry (dput() below) contains simplified information including the sample name ($name), the ASV genus ($Genus), the control the ASV originated from ($cont), and the relative ASV abundance ($contmeanabund), calculated using decostand(x, method="total") on the original ASV count matrix. The column $perc2 was calculated with the code below, and $perc holds the same calculation run on the whole dataset.
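# Grand total of contmeanabund across all rows (all samples pooled);
# named total so it does not mask base::sum()
total <- sum(as.numeric(trytry$contmeanabund))
trytry$perc2 <- (as.numeric(trytry$contmeanabund) / total) * 100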
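For comparison, here is a minimal sketch of what I think a per-sample percentage would look like, assuming the denominator should be each sample's own total rather than the grand total. The names demo, sample_total and perc_sample are made up for this illustration, and demo is just a plain data frame rebuilt from the dput() values in this post.

```r
# Plain data frame rebuilt from the dput() output in this post (three columns only).
demo <- data.frame(
  name = c("3H1C", "3H1C", "3H2C", "3H2C", "2H5C", "3H1C", "3H1C"),
  cont = c("shipcontrolsea", "shipcontrolsea", "shipcontrolsea",
           "shipcontrolsea", "shipcontrolcore", "shipcontrolcore",
           "shipcontrolsea"),
  contmeanabund = c(0.00268791084381583, 0.00268791084381583,
                    0.000302537863254877, 0.000302537863254877,
                    0.0149891746180217, 0.00686053939031891,
                    0.00268791084381583)
)

# ave() returns, for every row, the sum of contmeanabund within that row's sample.
sample_total <- ave(demo$contmeanabund, demo$name, FUN = sum)

# Assumption: the percentage is taken relative to the sample total, not the grand total.
demo$perc_sample <- demo$contmeanabund / sample_total * 100

# Check: the percentages within each sample should add up to 100.
tapply(demo$perc_sample, demo$name, sum)
```

If this is the right idea, plotting perc_sample instead of perc2 should give bars that reach 100 for every sample.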
When I visualize trytry, I am hoping to see full bars (to 100%) for each sample, including 3H2C, where the whole bar would be blue (shipcontrolsea). Instead, the bars reflect each row's relative abundance out of the total counts.
library(ggplot2)

# Stacked bars of perc2 per sample, colored by contamination source
ugh <- ggplot(trytry, aes(x = name, y = perc2, fill = cont)) +
  geom_bar(position = "stack", stat = "identity") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
print(ugh)
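One alternative I am aware of is letting ggplot2 do the normalization itself with position = "fill", which rescales every bar to a total of 1. The sketch below assumes that is acceptable for this plot. demo and p are hypothetical names, and demo is a plain data frame rebuilt from the dput() values in this post.

```r
library(ggplot2)

# Plain data frame rebuilt from the dput() output in this post (three columns only).
demo <- data.frame(
  name = c("3H1C", "3H1C", "3H2C", "3H2C", "2H5C", "3H1C", "3H1C"),
  cont = c("shipcontrolsea", "shipcontrolsea", "shipcontrolsea",
           "shipcontrolsea", "shipcontrolcore", "shipcontrolcore",
           "shipcontrolsea"),
  contmeanabund = c(0.00268791084381583, 0.00268791084381583,
                    0.000302537863254877, 0.000302537863254877,
                    0.0149891746180217, 0.00686053939031891,
                    0.00268791084381583)
)

# geom_col() is geom_bar(stat = "identity"); position = "fill" stacks the
# segments and rescales each bar so that it sums to 1.
p <- ggplot(demo, aes(x = name, y = contmeanabund, fill = cont)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent) +  # show the 0-1 axis as percent
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
print(p)
```

With this approach the raw abundances go straight into aes(y = ...), so no precomputed percentage column is needed.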
Any thoughts?
dput(trytry)
structure(list(name = c("3H1C", "3H1C", "3H2C", "3H2C", "2H5C",
"3H1C", "3H1C"), Genus = c("SAR116_clade", "SAR116_clade", "SAR116_clade",
"SAR116_clade", "Nitrosopumilaceae", "Nitrosopumilaceae", "Nonlabens"
), cont = c("shipcontrolsea", "shipcontrolsea", "shipcontrolsea",
"shipcontrolsea", "shipcontrolcore", "shipcontrolcore", "shipcontrolsea"
), perc = c(0.00166599678693457, 0.00166599678693457, 0.000187516304444513,
0.000187516304444513, 0.00929045574925936, 0.0042522379814814,
0.00166599678693457), contmeanabund = c(0.00268791084381583,
0.00268791084381583, 0.000302537863254877, 0.000302537863254877,
0.0149891746180217, 0.00686053939031891, 0.00268791084381583),
perc2 = c(8.80747377072101, 8.80747377072101, 0.991325401062996,
0.991325401062996, 49.1150078867825, 22.4799199989284, 8.80747377072101
)), row.names = c(NA, -7L), class = c("grouped_df", "tbl_df",
"tbl", "data.frame"), groups = structure(list(name = c("2H5C",
"3H1C", "3H1C", "3H2C"), cont = c("shipcontrolcore", "shipcontrolcore",
"shipcontrolsea", "shipcontrolsea"), .rows = structure(list(5L,
6L, c(1L, 2L, 7L), 3:4), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), .drop = TRUE))