StatsPlots.jl
MCMCChains implements many functions for plotting via StatsPlots.jl.
Simple example
The following simple example illustrates how to use Chains to visually summarize an MCMC simulation:
using MCMCChains
using StatsPlots
# Define the experiment
n_iter = 100
n_name = 3
n_chain = 2
# experiment results
val = randn(n_iter, n_name, n_chain) .+ [1, 2, 3]'
val = hcat(val, rand(1:2, n_iter, 1, n_chain))
# construct a Chains object
chn = Chains(val, [:A, :B, :C, :D])
# visualize the MCMC simulation results
plot(chn; size=(840, 600))
plot(chn, colordim = :parameter; size=(840, 400))
Note that the plot function takes the additional arguments described in the Plots.jl package.
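The array passed to Chains in the example is laid out as iterations × parameters × chains. As a quick sanity check on that synthetic data, the following sketch uses only the Julia standard library (no plotting) to confirm the shape and the per-parameter shift produced by the broadcast with [1, 2, 3]'. The name param_means is illustrative and not part of MCMCChains.

```julia
using Random, Statistics

Random.seed!(1)
n_iter, n_name, n_chain = 100, 3, 2

# Same construction as in the example: the 1×3 row vector broadcasts
# along the parameter dimension, shifting parameter j by j.
val = randn(n_iter, n_name, n_chain) .+ [1, 2, 3]'
val = hcat(val, rand(1:2, n_iter, 1, n_chain))

size(val)  # (100, 4, 2)

# Mean of each parameter over all iterations and chains.
param_means = vec(mean(val; dims=(1, 3)))
```

The first three means should land near 1, 2 and 3, and the fourth between 1 and 2, since it holds draws from {1, 2}.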
Mixed density
plot(chn, seriestype = :mixeddensity)
Or, for all seriestypes, use the alternative shorthand syntax:
mixeddensity(chn)
Trace
plot(chn, seriestype = :traceplot)
traceplot(chn)
Running average
meanplot(chn)
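To make the idea of a running average concrete, here is a minimal sketch in plain Julia. It assumes the running average at iteration t is the mean of draws 1 through t; the helper running_mean is hypothetical and is not the function meanplot calls internally.

```julia
# Running (cumulative) mean of one chain: element t is the mean of draws 1..t.
running_mean(x::AbstractVector) = cumsum(x) ./ (1:length(x))

draws = [2.0, 4.0, 6.0, 8.0]
running_mean(draws)  # [2.0, 3.0, 4.0, 5.0]
```

A running mean that keeps drifting late in the chain is a hint that the chain has not settled.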
Density
density(chn)
Histogram
histogram(chn)
Autocorrelation
autocorplot(chn)
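As a rough illustration of the quantity an autocorrelation plot displays, the sketch below computes the sample autocorrelation of a single chain at a given lag using only the standard library. The helper autocor_lag is a made-up name for this example; the estimator autocorplot uses may differ in its details.

```julia
using Statistics

# Sample autocorrelation of x at lag k, normalized by the lag-0 term.
function autocor_lag(x::AbstractVector, k::Integer)
    n = length(x)
    m = mean(x)
    num = sum((x[1:n-k] .- m) .* (x[1+k:n] .- m))
    den = sum(abs2, x .- m)
    return num / den
end

x = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
autocor_lag(x, 0)  # 1.0
autocor_lag(x, 1)  # -5/6, a strongly anti-correlated sequence
```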
Violin
Violin plots are similar to box plots but also show the probability density of the data at different values, smoothed by a kernel density estimator.
violinplot(chn) # Plotting all parameters across all chains
violinplot(chn, 1) # Plotting parameter 1 across all chains
violinplot(chn, :A) # Plotting a specific parameter across all chains
violinplot(chn, [:C, :B, :A]) # Plotting multiple specific parameters across all chains
violinplot(chn, 1, colordim = :parameter) # Plotting chain 1 across all parameters
violinplot(chn, show_boxplot = false) # Plotting all parameters without the inner boxplot
You can also aggregate (pool) samples from all chains for a given parameter by using append_chains = true. This is useful when you want to visualize the overall posterior distribution without distinguishing between individual chains.
violinplot(chn, :A, append_chains = true) # Single parameter, all chains appended
violinplot(chn, append_chains = true) # All parameters, all chains appended
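Conceptually, pooling means stacking the draws of every chain into one sample. The sketch below shows this on a small iterations × chains matrix in plain Julia; it illustrates the idea only and makes no claim about how violinplot pools the chains internally.

```julia
# Draws for one parameter, stored as iterations × chains (here 3 × 2).
draws = [1.0 10.0;
         2.0 20.0;
         3.0 30.0]

# Pooling stacks the chains end to end into a single sample vector.
pooled = vec(draws)  # [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
```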
You can also use the plot function with seriestype = :violinplot or seriestype = :violin:
plot(chn, seriestype = :violin)
Corner
corner(chn)
Energy Plot
The energy plot is a diagnostic tool for HMC-based samplers (like NUTS) that helps diagnose sampling efficiency by visualizing the energy and energy transition distributions. This plot requires that the chain contains the internal sampler statistics :hamiltonian_energy and :hamiltonian_energy_error.
# First, we generate a chain that includes the required sampler parameters.
n_iter = 1000
n_chain = 4
val_params = randn(n_iter, 2, n_chain)
val_energy = randn(n_iter, 1, n_chain) .+ 20
val_energy_error = randn(n_iter, 1, n_chain) .* 0.5
full_val = hcat(val_params, val_energy, val_energy_error)
parameter_names = [:a, :b, :hamiltonian_energy, :hamiltonian_energy_error]
section_map = (
    parameters=[:a, :b],
    internals=[:hamiltonian_energy, :hamiltonian_energy_error],
)
chn_energy = Chains(full_val, parameter_names, section_map)
# Generate the energy plot (default is a density plot).
energyplot(chn_energy)
# The plot can also be generated as a histogram.
energyplot(chn_energy, kind=:histogram)
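The comparison between energies and energy transitions can also be summarized as a single number. The sketch below computes, for one chain, the centered energies, the transitions between consecutive iterations, and their ratio of sums of squares, known in the HMC literature as E-BFMI. This is an illustration in plain Julia under those assumptions, not the computation energyplot performs, and the function name ebfmi is invented for this example.

```julia
using Statistics

# energy: Hamiltonian energy per iteration for a single chain.
function ebfmi(energy::AbstractVector)
    transitions = diff(energy)             # change in energy between iterations
    centered = energy .- mean(energy)      # energy relative to its mean
    return sum(abs2, transitions) / sum(abs2, centered)
end

energy = [20.0, 21.0, 19.0, 20.0]
ebfmi(energy)  # (1 + 4 + 1) / (0 + 1 + 1 + 0) = 3.0
```

Small values of this ratio indicate that the transitions are narrow relative to the overall energy distribution, which is the same mismatch an energy plot shows visually.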
For plotting multiple parameters, ridgeline, forest and caterpillar plots can be useful.
Ridgeline
ridgelineplot(chn, [:C, :B, :A])
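Ridgeline plots can mark a quantile interval on each density. As a small stand-alone illustration of what such an interval contains, the sketch below computes the 10% and 90% quantiles of a vector of stand-in draws with Statistics.quantile. The data are made up for the example.

```julia
using Statistics

draws = collect(0.0:1.0:10.0)          # 11 evenly spaced stand-in draws
lo, hi = quantile(draws, [0.1, 0.9])   # approximately 1.0 and 9.0
```

About 80% of the draws fall between lo and hi.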
Forest
forestplot(chn, [:C, :B, :A], hpd_val = [0.05, 0.15, 0.25])
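Each value in hpd_val is a complementary probability mass, so 0.05 corresponds to a 95% HPD interval. One common way to estimate such an interval from samples is to take the shortest window of sorted draws that covers the requested mass. The sketch below implements that idea in plain Julia; hpd_interval is a hypothetical helper, and the estimator MCMCChains uses may differ.

```julia
# Shortest interval containing at least a (1 - alpha) share of the samples.
function hpd_interval(x::AbstractVector, alpha::Real)
    sorted = sort(x)
    n = length(sorted)
    m = max(1, ceil(Int, (1 - alpha) * n))   # number of points inside the interval
    widths = [sorted[i + m - 1] - sorted[i] for i in 1:(n - m + 1)]
    i = argmin(widths)                        # start of the narrowest window
    return (sorted[i], sorted[i + m - 1])
end

x = [0.0, 1.0, 1.1, 1.2, 2.0, 5.0]
hpd_interval(x, 0.5)  # (1.0, 1.2)
```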
Caterpillar
forestplot(chn, chn.name_map[:parameters], hpd_val = [0.05, 0.15, 0.25], ordered = true)
Posterior Predictive Checks (PPC)
Posterior Predictive Checks (PPC) are essential tools for Bayesian model validation. They compare observed data with samples from the posterior predictive distribution to assess whether the model can reproduce key features of the data. Prior Predictive Checks can also be performed to evaluate prior appropriateness before seeing the data.
using Random
Random.seed!(123)
# Generate posterior samples (parameters)
n_iter = 500
posterior_data = randn(n_iter, 2, 2) # μ, σ parameters
posterior_chains = Chains(posterior_data, [:μ, :σ])
# Generate posterior predictive samples
n_obs = 20
pp_data = zeros(n_iter, n_obs, 2)
for i in 1:n_iter, j in 1:2
    μ = posterior_data[i, 1, j]
    σ = abs(posterior_data[i, 2, j]) + 0.5 # Ensure positive σ
    pp_data[i, :, j] = randn(n_obs) * σ .+ μ
end
pp_chains = Chains(pp_data)
# Generate observed data
Random.seed!(456)
observed = randn(n_obs) * 1.2 .+ 0.3
# Basic posterior predictive check (density overlay)
# Note: observed data is shown by default for posterior checks
ppcplot(posterior_chains, pp_chains, observed)
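A visual check is often paired with a numeric one. The sketch below compares a test statistic (here the mean) of each predictive draw with the same statistic of the observed data and reports the share of draws at or above the observed value. It uses only the standard library and made-up data; the names pp, pp_means and p_value are assumptions for this example, and ppcplot does not compute this quantity.

```julia
using Random, Statistics

Random.seed!(1)
n_draws, n_obs = 200, 20

# Hypothetical predictive draws: one row per draw, one column per observation.
pp = randn(n_draws, n_obs) .* 1.2 .+ 0.3
observed = randn(n_obs) .* 1.2 .+ 0.3

# Test statistic for each predictive draw, compared with the observed statistic.
pp_means = vec(mean(pp; dims=2))
p_value = mean(pp_means .>= mean(observed))
```

Values of p_value close to 0 or 1 indicate that the model rarely reproduces the observed statistic.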
Plot Types
Our PPC implementation supports four main plot types:
Density Plots (Default)
# Density overlay with customized transparency
ppcplot(posterior_chains, pp_chains, observed;
    kind=:density, alpha=0.3, num_pp_samples=50)
Histogram Comparison
# Normalized histogram comparison
ppcplot(posterior_chains, pp_chains, observed; kind=:histogram)
Cumulative Distribution Functions
# Empirical CDFs comparison
ppcplot(posterior_chains, pp_chains, observed; kind=:cumulative)
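For reference, an empirical CDF evaluated at a point t is simply the fraction of sample values that are less than or equal to t. A minimal stand-alone sketch, with ecdf_at as a hypothetical helper name:

```julia
# Empirical CDF of sample x evaluated at t: the fraction of values <= t.
ecdf_at(x::AbstractVector, t::Real) = count(<=(t), x) / length(x)

x = [3.0, 1.0, 2.0, 4.0]
ecdf_at(x, 2.5)  # 0.5
ecdf_at(x, 0.0)  # 0.0
ecdf_at(x, 4.0)  # 1.0
```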
Scatter Plots with Jitter
# Index-based scatter plot with automatic jitter for small samples
ppcplot(posterior_chains, pp_chains, observed;
    kind=:scatter, num_pp_samples=8, jitter=0.3)
Advanced Styling and Options
# Comprehensive customization example
ppcplot(posterior_chains, pp_chains, observed;
    kind=:density,
    colors=[:steelblue, :darkred, :orange], # [predictive, observed, mean]
    alpha=0.25,
    observed_rug=true, # Add rug plot for observed data
    num_pp_samples=75, # Limit predictive samples shown
    mean_pp=true, # Show predictive mean
    legend=true,
    random_seed=42) # Reproducible subsampling
Prior Predictive Checks
Prior predictive checks assess whether priors generate reasonable data before observing actual data. The ppc_group parameter controls default behavior:
# Prior predictive check - observed data hidden by default
ppcplot(posterior_chains, pp_chains, observed; ppc_group=:prior)
# Prior check with observed data explicitly shown for comparison
ppcplot(posterior_chains, pp_chains, observed;
    ppc_group=:prior, observed=true, alpha=0.4)
Controlling Observed Data Display
You can explicitly control whether observed data is shown regardless of the check type:
# Posterior check without observed data
ppcplot(posterior_chains, pp_chains, observed;
    ppc_group=:posterior, observed=false)
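The interplay between ppc_group and observed can be read as a small decision rule: an explicit observed value always wins, and otherwise the default depends on the check type. The sketch below spells that rule out in plain Julia. resolve_observed is a hypothetical helper written for illustration; it is not part of MCMCChains.

```julia
# Hypothetical helper mirroring the documented defaults; not an MCMCChains function.
function resolve_observed(observed::Union{Nothing,Bool}, ppc_group::Symbol)
    ppc_group in (:posterior, :prior) ||
        throw(ArgumentError("ppc_group must be :posterior or :prior"))
    # `nothing` means "not set": show observed data for posterior checks only.
    return observed === nothing ? ppc_group == :posterior : observed
end

resolve_observed(nothing, :posterior)  # true
resolve_observed(nothing, :prior)      # false
resolve_observed(true, :prior)         # true
```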
Performance and Sampling Control
For large datasets or when you want to reduce visual clutter:
# Limit the number of predictive samples displayed
ppcplot(posterior_chains, pp_chains, observed;
    num_pp_samples=25,
    random_seed=123) # Reproducible results
For reference, the full signature with its default keyword values:
ppcplot(posterior_chains::Chains, posterior_predictive_chains::Chains, observed_data::Vector;
    kind=:density, alpha=nothing, num_pp_samples=nothing, mean_pp=true, observed=nothing,
    observed_rug=false, colors=[:steelblue, :black, :orange], jitter=nothing,
    legend=true, random_seed=nothing, ppc_group=:posterior)
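Limiting the number of displayed predictive samples while keeping the result reproducible amounts to drawing a seeded random subset of draw indices. The sketch below shows one way to do that with the Random standard library. subsample_indices is a made-up helper, and the choice of MersenneTwister is an assumption for this example, not a statement about how ppcplot handles random_seed.

```julia
using Random

# Pick k of n draw indices without replacement, reproducibly for a given seed.
function subsample_indices(n::Integer, k::Integer, seed::Integer)
    rng = MersenneTwister(seed)
    return sort(randperm(rng, n)[1:min(k, n)])
end

a = subsample_indices(500, 25, 123)
b = subsample_indices(500, 25, 123)
a == b  # true: the same seed selects the same subset
```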
API
MCMCChains.energyplot (Function)
energyplot(chains::Chains; kind=:density, kwargs...)
Generate an energy plot for the samples in chains.
The energy plot is a diagnostic tool for HMC-based samplers like NUTS. It displays the distributions of the Hamiltonian energy and the energy transition (error) to diagnose sampler efficiency and identify divergences.
This plot is only available for chains that contain the :hamiltonian_energy and :hamiltonian_energy_error statistics in their :internals section.
Keywords
kind::Symbol (default: :density): The type of plot to generate. Can be :density or :histogram.
MCMCChains.energyplot! (Function)
Same docstring as MCMCChains.energyplot above.
MCMCChains.ppcplot (Function)
ppcplot(posterior_chains::Chains, posterior_predictive_chains::Chains, observed_data::Vector; kwargs...)
Generate a posterior/prior predictive check (PPC) plot comparing observed data with predictive samples.
PPC plots are a key tool for model validation in Bayesian analysis. They help assess whether the model can reproduce the key features of the observed data by comparing the observed data against samples from the posterior (or prior) predictive distribution.
Arguments
posterior_chains::Chains: MCMC samples from the posterior (or prior) distribution
posterior_predictive_chains::Chains: Samples from the posterior (or prior) predictive distribution
observed_data::Vector: The observed data values
Keywords
kind::Symbol (default: :density): Type of plot - :density, :histogram, :scatter, or :cumulative
alpha::Real (default: 0.2 for density/cumulative, 0.7 for scatter): Transparency of predictive curves
num_pp_samples::Integer (default: all samples): Number of predictive samples to plot
mean_pp::Bool (default: true): Whether to plot the mean of predictive distribution
observed::Bool (default: true for posterior, false for prior): Whether to plot observed data
observed_rug::Bool (default: false): Whether to add a rug plot for observed data (kde/cumulative only)
colors::Vector (default: [:steelblue, :black, :orange]): Colors for [predictive, observed, mean_pp]
jitter::Real (default: 0.0, 0.7 for scatter with ≤5 samples): Jitter amount for scatter plots
legend::Bool (default: true): Whether to show legend
random_seed::Integer (default: nothing): Random seed for reproducible subsampling
ppc_group::Symbol (default: :posterior): Specify :posterior or :prior for appropriate defaults and labeling
Examples
# Posterior Predictive Check
ppcplot(posterior_chains, posterior_predictive_chains, observed_data)
# Prior Predictive Check (observed data not shown by default)
ppcplot(prior_chains, prior_predictive_chains, observed_data; ppc_group=:prior)
# Histogram
ppcplot(chains, pp_chains, observed_data; kind=:histogram)
# Cumulative distribution
ppcplot(chains, pp_chains, observed_data; kind=:cumulative)
# Scatter plot with jitter
ppcplot(chains, pp_chains, observed_data; kind=:scatter, jitter=0.5)
# Prior check with observed data shown
ppcplot(prior_chains, pp_chains, observed_data; ppc_group=:prior, observed=true)
# Subset of predictive samples with custom colors
ppcplot(chains, pp_chains, observed_data;
    num_pp_samples=20,
    colors=[:blue, :red, :green],
    random_seed=42)
Notes
The ppc_group parameter controls default behavior:
:posterior: Shows observed data by default, uses "Posterior Predictive Check" title
:prior: Hides observed data by default, uses "Prior Predictive Check" title
MCMCChains.ppcplot! (Function)
Same docstring as MCMCChains.ppcplot above.
MCMCChains.ridgelineplot (Function)
ridgelineplot(chains::Chains[, params::Vector{Symbol}]; kwargs...)
Generate a ridgeline plot for the samples of the parameters params in chains.
By default, all parameters are plotted.
Keyword arguments
The following options are available:
fill_q (default: false) and fill_hpd (default: true): Fill the area below the curve in the quantiles interval (fill_q = true) or the highest posterior density (HPD) interval (fill_hpd = true). If both fill_q = false and fill_hpd = false, then the whole area below the curve is filled. If no fill color is desired, it should be specified with series attributes. These options are mutually exclusive.
show_mean (default: true) and show_median (default: true): Plot a vertical line of the mean (show_mean = true) or median (show_median = true) of the posterior density estimate. If both options are set to true, both lines are plotted.
show_qi (default: false) and show_hpdi (default: true): Plot a quantile interval (show_qi = true) or the largest HPD interval (show_hpdi = true) at the bottom of each density plot. These options are mutually exclusive.
q (default: [0.1, 0.9]): The two quantiles used for plotting if fill_q = true or show_qi = true.
hpd_val (default: [0.05, 0.2]): The complementary probability mass(es) of the highest posterior density intervals that are plotted if fill_hpd = true or show_hpdi = true.
MCMCChains.ridgelineplot! (Function)
Same docstring as MCMCChains.ridgelineplot above.
MCMCChains.forestplot (Function)
forestplot(chains::Chains[, params::Vector{Symbol}]; kwargs...)
Generate a forest or caterpillar plot for the samples of the parameters params in chains.
By default, all parameters are plotted.
Keyword arguments
ordered (default: false): If ordered = false, a forest plot is generated. If ordered = true, a caterpillar plot is generated.
fill_q (default: false) and fill_hpd (default: true): Fill the area below the curve in the quantiles interval (fill_q = true) or the highest posterior density (HPD) interval (fill_hpd = true). If both fill_q = false and fill_hpd = false, then the whole area below the curve is filled. If no fill color is desired, it should be specified with series attributes. These options are mutually exclusive.
show_mean (default: true) and show_median (default: true): Plot a vertical line of the mean (show_mean = true) or median (show_median = true) of the posterior density estimate. If both options are set to true, both lines are plotted.
show_qi (default: false) and show_hpdi (default: true): Plot a quantile interval (show_qi = true) or the largest HPD interval (show_hpdi = true) at the bottom of each density plot. These options are mutually exclusive.
q (default: [0.1, 0.9]): The two quantiles used for plotting if fill_q = true or show_qi = true.
hpd_val (default: [0.05, 0.2]): The complementary probability mass(es) of the highest posterior density intervals that are plotted if fill_hpd = true or show_hpdi = true.
MCMCChains.forestplot! (Function)
Same docstring as MCMCChains.forestplot above.