I have a dataset with a person-specific id variable and five binary variables holding the results of tests for several sexually transmitted infections (STIs). Each row is a person and each column is a test result that is either 0 (negative) or 1 (positive).
id result_ct result_gn result_lues result_hiv_lab result_hiv_rapid
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 xxxxxxxxxxxxx1 0 0 0 0 0
2 xxxxxxxxxxxxx2 0 0 0 0 0
3 xxxxxxxxxxxxx3 0 1 0 0 0
4 xxxxxxxxxxxxx4 0 0 0 0 0
...
I’d like a bar plot with all five variables on the x axis and, for each one, the total number of positive results as the bar height. In other words, the plot should sum all the 1s in result_ct across the whole dataset (say this adds up to 20) and draw a bar of that height, and do the same for the other four variables.
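To make the target concrete: the numbers wanted are simply the per-column sums of the 0/1 columns. A minimal sketch on made-up toy data (the data frame `wide` and its values are hypothetical; only the column names come from the dataset above, and dropping missing values with `na.rm = TRUE` is an assumption):

```r
# Toy data, invented for illustration; only the column names match the real dataset
wide <- data.frame(
  id               = c("a", "b", "c", "d"),
  result_ct        = c(0, 0, 1, 0),
  result_gn        = c(0, 0, 1, 1),
  result_lues      = c(0, 0, 0, 0),
  result_hiv_lab   = c(1, 0, 0, 0),
  result_hiv_rapid = c(1, 0, 0, 0)
)

# The columns are coded 0/1, so each column sum is the number of positive results
counts <- colSums(wide[, -1], na.rm = TRUE)
counts
```

These per-variable totals are the bar heights the plot should show.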
I guess the dataset needs to be in long format for that, which I have already done (example below), but now I don’t know what to do with it. Many thanks for your help!
id variable value
1 xxxxxxxxxxxxx1 result_ct 0
2 xxxxxxxxxxxxx2 result_ct 0
3 xxxxxxxxxxxxx3 result_ct 0
4 xxxxxxxxxxxxx1 result_gn 0
5 xxxxxxxxxxxxx2 result_gn 1
6 xxxxxxxxxxxxx2 result_gn 0
...
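For readers who want to reproduce the long format: the post does not say how the reshaping was done, so the following is only one possible way, sketched with `tidyr::pivot_longer()` on hypothetical toy data. Note that `pivot_longer()` orders the rows by person rather than by variable, so the row order differs from the excerpt above.

```r
library(tidyr)

# Toy wide data, invented for illustration; column names follow the real dataset
wide <- data.frame(
  id        = c("a", "b", "c"),
  result_ct = c(0, 0, 1),
  result_gn = c(0, 1, 0)
)

# One row per person and test; names_to/values_to are chosen to match the excerpt above
long <- pivot_longer(
  wide,
  cols      = starts_with("result_"),
  names_to  = "variable",
  values_to = "value"
)
long
```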
The best I have managed so far is
ggplot(long, aes(value)) + geom_histogram(aes(fill = variable), position = "dodge")
That didn’t help, because the STI variable should not be the fill variable but should sit directly on the x axis.
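A minimal sketch of one way to get the variable onto the x axis, assuming the long format shown above (columns `id`, `variable`, `value`): sum the values per variable first, then draw the sums with `geom_col()` instead of binning with `geom_histogram()`. The toy data and the axis labels are hypothetical.

```r
library(ggplot2)

# Toy long-format data, invented for illustration; column names match the example above
long <- data.frame(
  id       = rep(c("a", "b", "c"), times = 2),
  variable = rep(c("result_ct", "result_gn"), each = 3),
  value    = c(0, 1, 1, 0, 0, 1)
)

# Sum the 0/1 values per variable to get one count of positives per test
counts <- aggregate(value ~ variable, data = long, FUN = sum)

# geom_col() uses the y values as bar heights, with one bar per variable on the x axis
p <- ggplot(counts, aes(x = variable, y = value)) +
  geom_col() +
  labs(x = "Test", y = "Number of positive results")
p
```

Aggregating before plotting keeps the counts available as a small table that can be checked against the raw data. Rows with a missing `value` are dropped by the default `na.action` of the formula interface to `aggregate()`.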