I am trying to use the package ggdag in R to better understand the results of my modelling.
If I wish to model the relationship response ~ event, I can easily find which variables to control for:
library(ggdag)
dagify(
response ~ event + confounder,
event ~ confounder,
exposure = c("event"),
outcome = c("response")
) |> ggdag_adjustment_set()
However, if I am now interested in modelling the reverse relationship, that is event ~ response (for example, to predict the probability of event given response), I get a warning saying that it is not possible to close backdoor paths:
library(ggdag)
#>
#> Attaching package: 'ggdag'
#> The following object is masked from 'package:stats':
#>
#> filter
dagify(
response ~ event + confounder,
event ~ confounder,
exposure = c("response"),
outcome = c("event")
) |> ggdag_adjustment_set()
#> Warning in dag_adjustment_sets(., exposure = exposure, outcome = outcome, : Failed to close backdoor paths. Common reasons include:
#> * graph is not acyclic
#> * backdoor paths are not closeable with given set of variables
#> * necessary variables are unmeasured (latent)
The same happens in an even simpler DAG:
dagify(
response ~ event + confounder,
event ~ confounder,
exposure = c("response"),
outcome = c("event")
) |> ggdag_adjustment_set()
It appears to me that, since the direction of causation is event -> response, the path I am asking for (response -> event) is interpreted by ggdag as a backdoor path, hence the warning.
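One way to check this reading is to list the paths between the two nodes directly. The following is only a sketch: it assumes the dagitty package (which ggdag builds on) is installed, and the names dag, p and sets are introduced here for illustration. If the interpretation is right, the direct edge between event and response should be listed as an open path that no conditioning set can block, and dagitty should return no adjustment set.

```r
library(ggdag)

# Same DAG as above, with exposure and outcome reversed
dag <- dagify(
  response ~ event + confounder,
  event ~ confounder,
  exposure = "response",
  outcome = "event"
)

# List every path between the two nodes and whether each one is open
# when nothing is conditioned on
p <- dagitty::paths(dag, from = "response", to = "event")
print(p)

# Ask dagitty directly for adjustment sets for the effect of response on event
sets <- dagitty::adjustmentSets(dag, exposure = "response", outcome = "event")
print(sets)
```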
Since my actual model is much more complex than these simple examples, I would still like to get an adjustment set from ggdag.
What would be the correct specification of the DAG? Should one just leave the “exposure” and “outcome” as in the first case, even if modelling in the reverse direction?