I want to generate a plot that keeps all the colors in the legend, even when I plot only a subset of the data. When I build a reproducible example it works, but my real code does not, and I am going out of my mind trying to work out what I am missing. Below are my working reproducible example and a dput of the data for the code that won’t work. I would appreciate any help correcting this.
#First my working case:
library("RColorBrewer")
library("ggplot2")
library("dplyr") #needed for %>% and filter()
#Create the color values for the data
gear_fill <- brewer.pal(n = 3, "Dark2")
names(gear_fill) <- sort(unique(mtcars$gear))
#Plot the data
mtcars %>%
  filter(gear %in% c(4, 5)) %>%
  ggplot(aes(x = gear,
             y = mpg,
             fill = factor(gear, sort(unique(mtcars$gear))))) +
  geom_col(show.legend = TRUE) +
  scale_fill_manual(name = "Gear",
                    drop = FALSE,
                    values = gear_fill)
This correctly shows the color for gear 3 in the legend even though I filtered that gear out of the data:
#Now my problem
#Create data from dput - data anonymized
data1 <- structure(list(project.code = c("External Event", "External Event",
"External Event", "External Event", "External Event", "External Event",
"External Event", "External Event", "Internship", "Training/Course",
"Training/Course", "Training/Course"), university.people = c(" ",
"h", "m", "n", "o", "r", "s", "t", "r", "m", "s", "t"), university.partner = c("uni1",
"uni1", "uni1", "uni1", "uni1", "uni1", "uni1", "uni1", "uni1",
"uni1", "uni1", "uni1"), n = c(2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 3L, 2L, 3L)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -12L), groups = structure(list(project.code = c("External Event",
"External Event", "External Event", "External Event", "External Event",
"External Event", "External Event", "External Event", "Internship",
"Training/Course", "Training/Course", "Training/Course"), university.people = c(" ",
"h", "m", "n", "o", "r", "s", "t", "r", "m", "s", "t"), .rows = structure(list(
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -12L), .drop = TRUE))
#Get the factor levels - this is a small recreation of my full dataset
data2 <- data.frame(project.code = c("Internship",
"Research collaboration",
"External Event",
NA,
"Training/Course",
"Internal Event",
"Staff Development"))
#Create my color values
project_code_colors <- brewer.pal(6, "Dark2")
names(project_code_colors) <- sort(unique(data2$project.code))
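As a sanity check on the color vector, here is a small base-R sketch that confirms the number of names matches the six colors requested. The variable names `codes` and `lvls` are only for illustration; the values are the same project codes as in data2.

```r
# Same project codes as in data2 (illustrative copy)
codes <- c("Internship", "Research collaboration", "External Event", NA,
           "Training/Course", "Internal Event", "Staff Development")

# sort() drops NA by default, so only the non-missing codes remain
lvls <- sort(unique(codes))

length(lvls)  # 6, the same as the number of colors from brewer.pal(6, "Dark2")
anyNA(lvls)   # FALSE
```

So the names and the colors should line up one to one, and the NA code is not part of the named vector.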
#Attempt to visualize
data1 %>%
ggplot(aes(y = university.people, x = n,
fill = factor(project.code, levels = sort(unique(data2$project.code))))) +
geom_col(col = "black") +
theme_bw() +
labs(x = element_blank(), y = element_blank(),
fill = "Partnership type:", ) +
scale_fill_manual(drop = FALSE,
values = c(project_code_colors, na.value = "grey"))
This shows all the levels but has empty color values for the levels that are not in the data:
If someone can spot the error, I would welcome it.
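Comparing the two snippets, two differences stand out: the working case passes `show.legend = TRUE` to `geom_col()`, and the failing case puts `na.value` inside `c()` instead of passing it as its own argument to `scale_fill_manual()`. Below is a minimal sketch of a variant that mirrors the working case. It is not a confirmed fix: it assumes ggplot2 3.5.0 or later, where, as far as I understand, legend keys for unused levels are only filled when the layer sets `show.legend = TRUE`. The data frame `df` is a hypothetical cut-down stand-in for data1.

```r
library(ggplot2)
library(RColorBrewer)

# All possible project codes (same values as data2; NA is dropped by sort())
codes <- c("Internship", "Research collaboration", "External Event", NA,
           "Training/Course", "Internal Event", "Staff Development")
lvls <- sort(unique(codes))
cols <- setNames(brewer.pal(length(lvls), "Dark2"), lvls)

# Hypothetical cut-down stand-in for data1: only three of the six codes occur
df <- data.frame(people = c("h", "m", "r"),
                 n      = c(2L, 1L, 2L),
                 code   = c("External Event", "Training/Course", "Internship"))

p <- ggplot(df, aes(x = n, y = people,
                    fill = factor(code, levels = lvls))) +
  # show.legend = TRUE mirrors the working mtcars example
  geom_col(colour = "black", show.legend = TRUE) +
  # na.value is passed as its own argument, not inside c()
  scale_fill_manual(name = "Partnership type:",
                    values = cols,
                    drop = FALSE,
                    na.value = "grey")
p
```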