I’m trying to use a continuous colour scale, tweaked based on some property of the data. In the following example, let’s say I want to highlight the points for which cc sits on or near the first quartile:
library(tidyverse)  # provides tibble(), ggplot() and %>%

dd <- tibble(aa = runif(50), bb = runif(50), cc = runif(50))
ggplot(dd) +
  geom_point(mapping = aes(x = aa, y = bb, colour = cc)) +
  scale_colour_gradientn(colours = c("blue", "red", "blue"),
                         values = scales::rescale(c(0, quantile(dd$cc, 0.25), 1)))
This works as expected, but I need to explicitly reference dd$cc in the call to scale_colour_gradientn. This is normally not a problem, but now I’m trying to put that into a pipeline, like so:
dd <- tibble(aa = runif(50), bb = runif(50), cc = runif(50))
dd %>% mutate(ee = aa / cc) %>%
  ggplot() +
  geom_point(mapping = aes(x = aa, y = bb, colour = ee)) +
  scale_colour_gradientn(colours = c("blue", "red", "blue"),
                         values = scales::rescale(c(0, quantile(dd$ee, 0.25), 1)))
Of course, this does not work. At the point where I call scale_colour_gradientn, dd has no column called ee, so dd$ee is meaningless. Also, I think (but I’m not too good with the intricacies of data-masking and such) that since scale_colour_gradientn takes no data argument, there is no way it can know what happened upstream in the pipeline.
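To make the first point concrete, here is a minimal check, assuming dplyr and tibble are installed. The names piped, has_ee_in_dd and has_ee_in_piped are made up for this illustration only, and the seed is arbitrary:

```r
library(dplyr)   # %>% and mutate()
library(tibble)  # tibble()

set.seed(1)  # arbitrary seed, only so the example is reproducible
dd <- tibble(aa = runif(50), bb = runif(50), cc = runif(50))

# The pipeline returns a new tibble; dd itself is never modified
piped <- dd %>% mutate(ee = aa / cc)

has_ee_in_dd    <- "ee" %in% names(dd)     # FALSE: dd still has only aa, bb, cc
has_ee_in_piped <- "ee" %in% names(piped)  # TRUE: ee exists only in the piped result
```

So the ee column only lives in the object flowing through the pipe, which is exactly what I cannot name from inside the scale call.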
And of course, in this example, it is easy to create an intermediate variable, e.g.:
dd2 <- dd %>% mutate(ee = aa / cc)
ggplot(dd2) +
  geom_point(mapping = aes(x = aa, y = bb, colour = ee)) +
  scale_colour_gradientn(colours = c("blue", "red", "blue"),
                         values = scales::rescale(c(0, quantile(dd2$ee, 0.25), 1)))
But, for the sake of argument, let’s say I want everything to run in one go (Ctrl+Enter in RStudio), and/or I don’t want to add intermediate variables to the workspace.
Is there a way to write something like my second snippet, along these lines?
dd %>% mutate(ee = aa / cc) %>%
  ggplot() +
  geom_point(mapping = aes(x = aa, y = bb, colour = ee)) +
  scale_colour_gradientn(colours = c("blue", "red", "blue"),
                         values = scales::rescale(c(0,
                                                    quantile(SOME MAGIC HERE TO GET ee, 0.25),
                                                    1)))
Thanks!