I’ve been using ggdist along with distributional and
posterior to visualize Bayesian model results. Huge thanks to the
developers of these packages; they’re a game-changer from the much more tedious way I used to
do things.
I’m struggling with a seemingly simple problem in ggdist, however.
I’m trying to plot density histograms of posterior samples overlaid with
analytical prior PDFs in a faceted layout, with one parameter per panel.
I can’t figure out how to make the y-axis (density) scales (1)
commensurate between slab geoms within a panel, so the prior and
posterior have the same integral, and also (2) free to vary across
panels. Consider a toy example:
# setup
library(dplyr)
library(distributional)
library(ggplot2)
library(ggdist)
theme_set(theme_bw(base_size = 16))
# posterior draws
samples <- data.frame(parameter = factor(rep(letters[1:2], 1000)),
value = rnorm(2000, mean = 0, sd = c(1,100)))
# prior PDFs using distributional
pdfs <- data.frame(parameter = factor(letters[1:2]),
pdf = c(dist_normal(0,2), dist_normal(0,200)))
What I want is the equivalent of this base plot. (Apologies for the linked images; apparently I’m not reputable enough to post images inline even in my own question, contrary to this explanation.)
fx <- function(x, pdf) unlist(density(pdf, x))
par(mfcol = c(1,2), mar = c(3,2,4,2), oma = c(2,2,0,0))
for(i in levels(samples$parameter)) {
hist(samples$value[samples$parameter == i], prob = TRUE,
xlab = "", ylab = "", main = i, las = 1)
curve(fx(x, pdfs$pdf[pdfs$parameter == i]), lwd = 2, add = TRUE)
}
mtext(c("value","density"), side = c(1,2), outer = TRUE, line = c(0,1), cex = 1.3)
Here’s one attempt in ggdist:
samples %>%
ggplot() +
stat_slab(aes(x = value), density = "histogram", normalize = "panels") +
stat_slab(data = pdfs, aes(xdist = pdf), normalize = "panels", col = "black", fill = NA) +
facet_wrap(vars(parameter), scales = "free") +
labs(x = "value", y = "density", title = "normalize = 'panels'",
subtitle = "equalizes max densities, not integrals")
This normalizes the maximum thicknesses of the prior and posterior
slabs, independently, within each panel. Thus the plotted histogram
and PDF are not comparable; they don’t have the same integral. From
reading the doc and vignette, I thought scale_thickness_shared()
would resolve this
issue, and it does work as expected when a single parameter is plotted.
But when faceting, this causes the free y-axis scales to be ignored,
as if I had changed normalize = "panels"
to "none"
:
samples %>%
ggplot() +
stat_slab(aes(x = value), density = "histogram", normalize = "panels") +
stat_slab(data = pdfs, aes(xdist = pdf), normalize = "panels", col = "black", fill = NA) +
scale_thickness_shared() +
facet_wrap(vars(parameter), scales = "free") +
labs(x = "value", y = "density", title = "normalize = 'panels', scale_thickness_shared()",
subtitle = "equalizes integrals, but density scales are no longer free")
I’ve tried alternative normalize
settings and the renormalize
argument to scale_thickness_shared()
to no avail. Ideally I would
prefer the heights of the density histogram and the PDF to be the actual
numerical densities, as in the base graphics version. I don’t fully
understand how the slab thickness
aesthetic works, but I gather this
isn’t possible because thickness
is somehow independent of, and
rescaled relative to, the y-axis scale. That’s not a deal-breaker
since the absolute density values are meaningless for my purposes, but I do
need the integrals of the two slabs to be the same within each panel and
the scales to be free across panels.
I’m probably missing something obvious. Any help greatly appreciated!