Through the front door

library(CausalQueries)
library(dplyr)
library(knitr)

Here is an example of a model in which X causes M and M causes Y. There is, in addition, unobservable confounding between X and Y. In a model like this you might use information on M to figure out whether X caused Y, making use of the “front door criterion.”
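For reference, the textbook front-door adjustment formula can be written as below. This is the standard statement of the result, not something taken from the package; the symbols x, x', m, and y are generic values of X, M, and Y.

```latex
P(Y = y \mid do(X = x)) = \sum_{m} P(M = m \mid X = x) \sum_{x'} P(Y = y \mid M = m, X = x') \, P(X = x')
```

With binary variables, the average treatment effect is then P(Y = 1 | do(X = 1)) - P(Y = 1 | do(X = 0)), which corresponds to the query "Y[X=1] - Y[X=0]" used further down. The code in this example does not plug data into this formula; it updates beliefs over the whole model and then queries the result.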

The DAG is defined using dagitty syntax like this:

model <- make_model("X -> M -> Y <-> X")

We might set priors thus:

model <- set_priors(model, distribution = "jeffreys")
#> Altering all parameters.

You can plot the DAG like this:

plot(model)
Front door model

Updating is done like this:

# Let's imagine highly correlated data; here an effect of .9 at each step
data <- data.frame(X = rep(0:1, 2000)) |>
  mutate(
    M = rbinom(n(), 1, .05 + .9*X),
    Y = rbinom(n(), 1, .05 + .9*M))

# Updating
model <- model |> update_model(data, refresh = 0)

Finally you can calculate an estimand of interest like this:

query_model(
    model = model,
    using = c("priors", "posteriors"),
    query = "Y[X=1] - Y[X=0]"
    ) |>
  kable(digits = 2)
| label | query | given | using | case_level | mean | sd | cred.low | cred.high |
|:--|:--|:--|:--|:--|--:|--:|--:|--:|
| Y[X=1] - Y[X=0] | Y[X=1] - Y[X=0] | - | priors | FALSE | 0.00 | 0.14 | -0.34 | 0.29 |
| Y[X=1] - Y[X=0] | Y[X=1] - Y[X=0] | - | posteriors | FALSE | 0.79 | 0.02 | 0.76 | 0.82 |

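This uses the posterior distribution and the model to assess the average treatment effect estimand. In the simulated data the effect is .9 at each step, so the true average treatment effect is .9 × .9 = .81, and the posterior mean of 0.79 is close to it.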
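As a rough cross-check on the posterior, the front-door adjustment can also be computed by hand from the observed frequencies. The following is a minimal base R sketch, not part of CausalQueries: the helper `front_door()`, the object `ate_front_door`, and the seed are illustrative assumptions, and the data are re-simulated so that the block runs on its own.

```r
# Re-simulate data as in the example above (the seed is an arbitrary choice)
set.seed(1)
n <- 4000
X <- rep(0:1, n / 2)
M <- rbinom(n, 1, .05 + .9 * X)
Y <- rbinom(n, 1, .05 + .9 * M)

# Hypothetical helper: plug-in front-door estimate of P(Y = 1 | do(X = x)),
# i.e. sum_m P(M = m | X = x) * sum_x' P(Y = 1 | M = m, X = x') * P(X = x')
front_door <- function(x, X, M, Y) {
  total <- 0
  for (m in 0:1) {
    p_m_given_x <- mean(M[X == x] == m)
    inner <- 0
    for (x_prime in 0:1) {
      p_y_given_mx <- mean(Y[M == m & X == x_prime])
      inner <- inner + p_y_given_mx * mean(X == x_prime)
    }
    total <- total + p_m_given_x * inner
  }
  total
}

# Plug-in estimate of the average treatment effect
ate_front_door <- front_door(1, X, M, Y) - front_door(0, X, M, Y)
ate_front_door
```

Because the data here are simulated without any actual X-Y confounding, a naive difference in means would give a similar number; the point of the front-door calculation is that it remains valid even when such confounding is present, which the model allows for.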

Let’s compare now with the case where you do not have data on M:


model |>
  update_model(data |> dplyr::select(X, Y), refresh = 0) |>
  query_model(
    using = c("priors", "posteriors"),
    query = "Y[X=1] - Y[X=0]") |>
  kable(digits = 2)
| label | query | given | using | case_level | mean | sd | cred.low | cred.high |
|:--|:--|:--|:--|:--|--:|--:|--:|--:|
| Y[X=1] - Y[X=0] | Y[X=1] - Y[X=0] | - | priors | FALSE | 0.0 | 0.14 | -0.34 | 0.34 |
| Y[X=1] - Y[X=0] | Y[X=1] - Y[X=0] | - | posteriors | FALSE | 0.1 | 0.17 | -0.03 | 0.60 |

Here we update much less and are (relatively) much less certain in our beliefs, precisely because we are aware of the confounded relationship between X and Y but lack the data on M that we could use to address it.

Try it

Say X, M, and Y were perfectly correlated. Would the average treatment effect be identified?
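One way to explore this is to build a dataset in which M copies X and Y copies M, and then rerun the same steps. The sketch below is a starting point under those assumptions; `perfect_data` is an illustrative name, and the model-fitting part runs only if CausalQueries (and its Stan backend) is available.

```r
# Perfectly correlated data: M copies X and Y copies M (an illustrative construction)
perfect_data <- data.frame(X = rep(0:1, 2000))
perfect_data$M <- perfect_data$X
perfect_data$Y <- perfect_data$M

# Fit and query only if CausalQueries is installed; updating needs a working Stan setup
if (requireNamespace("CausalQueries", quietly = TRUE)) {
  library(CausalQueries)
  make_model("X -> M -> Y <-> X") |>
    set_priors(distribution = "jeffreys") |>
    update_model(perfect_data, refresh = 0) |>
    query_model(
      using = c("priors", "posteriors"),
      query = "Y[X=1] - Y[X=0]") |>
    print()
}
```

Comparing the width of the posterior credibility interval with the one obtained earlier gives a sense of how much the data pin down the estimand in this limiting case.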