I’ve been using ggdist along with distributional and
posterior to visualize Bayesian model results.
I’m struggling with a seemingly simple problem in ggdist, however.
I’m trying to plot density histograms of posterior samples overlaid with
analytical prior PDFs in a faceted layout, with one parameter per panel.
I can’t figure out how to make the y-axis (density) scales (1)
commensurate between slab geoms within a panel, so the prior and
posterior have the same integral, and also (2) free to vary across
panels. Consider a toy example:
# setup
library(dplyr)
library(distributional)
library(ggplot2)
library(ggdist)
theme_set(theme_bw(base_size = 16))
# posterior draws
samples <- data.frame(parameter = factor(rep(letters[1:2], 1000)),
value = rnorm(2000, mean = 0, sd = c(1,100)))
# prior PDFs using distributional
pdfs <- data.frame(parameter = factor(letters[1:2]),
pdf = c(dist_normal(0,2), dist_normal(0,200)))
What I want is the equivalent of this base plot. (Apologies for the linked images; apparently I’m not reputable enough to post images inline even in my own question, contrary to this explanation.)
fx <- function(x, pdf) unlist(density(pdf, x))
par(mfcol = c(1,2), mar = c(3,2,4,2), oma = c(2,2,0,0))
for(i in levels(samples$parameter)) {
hist(samples$value[samples$parameter == i], prob = TRUE,
xlab = "", ylab = "", main = i, las = 1)
curve(fx(x, pdfs$pdf[pdfs$parameter == i]), lwd = 2, add = TRUE)
}
mtext(c("value","density"), side = c(1,2), outer = TRUE, line = c(0,1), cex = 1.3)
Here’s one attempt in ggdist:
samples %>%
ggplot() +
stat_slab(aes(x = value), density = "histogram", normalize = "panels") +
stat_slab(data = pdfs, aes(xdist = pdf), normalize = "panels",
col = "black", fill = NA) +
facet_wrap(vars(parameter), scales = "free") +
labs(x = "value", y = "density", title = "normalize = 'panels'",
subtitle = "equalizes max densities, not integrals")
This normalizes the maximum thicknesses of the prior and posterior
slabs, independently, within each panel. Thus the plotted histogram
and PDF are not comparable; they don’t have the same integral. From
reading the doc and vignette, I thought scale_thickness_shared()
would resolve this
issue, and it does work as expected when a single parameter is plotted.
But when faceting, this causes the free y-axis scales to be ignored,
as if I had changed normalize = "panels"
to "none"
:
samples %>%
ggplot() +
stat_slab(aes(x = value), density = "histogram") +
stat_slab(data = pdfs, aes(xdist = pdf), col = "black", fill = NA) +
scale_thickness_shared() +
facet_wrap(vars(parameter), scales = "free") +
labs(x = "value", y = "density", title = "scale_thickness_shared()",
subtitle = "equalizes integrals, but density scales are no longer free")
I’ve tried alternative normalize
settings and the renormalize
argument to scale_thickness_shared()
to no avail. Ideally I would
prefer the heights of the density histogram and the PDF to be the actual
numerical densities, as in the base graphics version. I don’t fully
understand how the slab thickness
aesthetic works, but I gather this
isn’t possible because thickness
is somehow independent of, and
rescaled relative to, the y-axis scale. That’s not a deal-breaker
since the absolute density values are meaningless for my purposes, but I do
need the integrals of the two slabs to be the same within each panel and
the scales to be free across panels.
I’m probably missing something obvious.