--- title: "Inspecting posteriors" output: rmarkdown::html_vignette: md_extensions: [ "-autolink_bare_uris" ] vignette: > %\VignetteIndexEntry{Inspecting posteriors} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup, message = FALSE, warning = FALSE} library(CausalQueries) library(knitr) library(ggplot2) library(rstan) library(bayesplot) rstan_options(refresh = 0) ``` # Accessing the posterior When you update a model using `CausalQueries`, `CausalQueries` generates and updates a `stan` model and saves the posterior distribution over parameters in the model. The basic usage is: ```{r, eval = FALSE} data <- data.frame(X = rep(c(0:1), 10), Y = rep(c(0:1), 10)) model <- make_model("X -> Y") |> update_model(data) ``` ```{r, include = FALSE} data <- data.frame(X = rep(c(0:1), 10), Y = rep(c(0:1), 10)) model <- make_model("X -> Y") |> update_model(data, refresh = 0) ``` The posterior over parameters can be accessed thus: ```{r} inspect(model, "posterior_distribution") ``` When querying a model you can request use of the posterior distribution with the `using` argument: ```{r} model |> query_model( query = "Y[X=1] > Y[X=0]", using = c("priors", "posteriors")) |> kable(digits = 2) ``` # Summary of stan performance You can access a summary of the parameter values and convergence information as produced by `stan` thus: ```{r} inspect(model, "stan_summary") ``` This summary provides information on the distribution of parameters as well as convergence diagnostics, summarized in the `Rhat` column. In the printout above the first six rows show the distribution of the model parameters; the next eight rows show the distribution over transformed parameters, here the causal types. The last row shows the unnormalized log density on Stan's unconstrained space which, as described in [Stan documentation](https://mc-stan.org/cmdstanr/reference/fit-method-lp.html) is intended to diagnose sampling efficiency and evaluate approximations. See [stan documentation](http://mc-stan.org/rstan/articles/stanfit_objects.html) for further details. # Advanced diagnostics If you are interested in advanced diagnostics of performance you can save and access the `raw` stan output. ```{r, eval = FALSE} model <- make_model("X -> Y") |> update_model(data, keep_fit = TRUE) ``` ```{r, include = FALSE} model <- make_model("X -> Y") |> update_model(data, refresh = 0, keep_fit = TRUE) ``` Note that the summary for this raw output shows the labels used in the generic `stan` model: `lambda` for the vector of parameters, corresponding to the parameters in the parameters dataframe (`inspect(model, "parameters_df")`), and , if saved, a vector `types` for the causal types (see `inspect(model, "causal_types")`) and `w` for the event probabilities (`inspect(model, "prior_event_probabilities")`). ```{r} model |> inspect("stanfit") ``` You can then use diagnostic packages such as `bayesplot`. ```{r} model |> inspect("stanfit") |> bayesplot::mcmc_pairs(pars = c("lambdas[3]", "lambdas[4]", "lambdas[5]", "lambdas[6]")) ``` ```{r} np <- model |> inspect("stanfit") |> bayesplot::nuts_params() head(np) |> kable() model |> inspect("stanfit") |> bayesplot::mcmc_trace(pars = "lambdas[5]", np = np) ```