Main Content

diagnostics

Class: HamiltonianSampler

Markov Chain Monte Carlo diagnostics

Syntax

tbl = diagnostics(smp,chains)
tbl = diagnostics(smp,chains,'MaxLag',maxlag)

Description

tbl = diagnostics(smp,chains) returns Markov Chain Monte Carlo diagnostics for the chains in chains.

tbl = diagnostics(smp,chains,'MaxLag',maxlag) specifies the maximum number of autocorrelation lags to use for computing effective sample sizes.

Input Arguments

expand all

Hamiltonian Monte Carlo sampler, specified as a HamiltonianSampler object.

Use the hmcSampler function to create a sampler.

MCMC chains, specified as one of the following:

  • A matrix, where each row is a sample and each column a parameter.

  • A cell array of matrices, where the chain chains{i} is a matrix where each row is a sample and each column a parameter.

The number of parameters (that is, matrix columns) must equal the number of elements of the StartPoint property of the smp sampler.

Maximum number of autocorrelation lags for computing effective sample sizes, specified as a positive integer.

The effective sample size calculation uses lags of 1,2,...,maxlag for each chain in chains that has more than maxlag samples.

For chains with maxlag or fewer samples, the calculation uses Ni - 1 lags, where Ni is the number of samples of chain i.

Example: 'MaxLag',50

Output Arguments

expand all

MCMC diagnostics, computed using all the chains in chains and returned as a table with these columns.

ColumnDescription
Name

Variable name

Mean

Posterior mean estimate

MCSE

Estimate of the Monte Carlo standard error (the standard deviation of the posterior mean estimate)

SD

Estimate of the posterior standard deviation

Q5

Estimate of the 5th quantile of the marginal posterior distribution

Q95

Estimate of the 95th quantile of the marginal posterior distribution

ESS

Effective sample size for the posterior mean estimate. Larger effective sample sizes lead to more accurate results. If the samples are independent, then the effective sample size is equal to the number of samples.

RHat

Gelman-Rubin convergence statistic. As a rule of thumb, values of RHat less than 1.1 are interpreted as a sign that the chains have converged to the target distribution. If RHat for any variable is larger than 1.1, try drawing more Monte Carlo samples.

Examples

expand all

Create MCMC chains using a Hamiltonian Monte Carlo (HMC) sampler and compute MCMC diagnostics.

First, save a function on the MATLAB® path that returns the multivariate normal log probability density and its gradient. In this example, that function is called normalDistGrad and is defined at the end of the example. Then, call this function with arguments to define the logpdf input argument to the hmcSampler function.

means = [1;-2;2];
standevs = [1;2;0.5];
logpdf = @(theta)normalDistGrad(theta,means,standevs);

Choose a starting point. Create the HMC sampler and tune its parameters.

startpoint = randn(3,1);
smp = hmcSampler(logpdf,startpoint);
smp = tuneSampler(smp);

Draw samples from the posterior density, using a few independent chains. Choose different, randomly distributed starting points for each chain. Specify the number of burn-in samples to discard from the beginning of the Markov chain and the number of samples to generate after the burn-in.

NumChains  = 4;
chains     = cell(NumChains,1);
Burnin     = 500;
NumSamples = 1000;
for c = 1:NumChains
    chains{c} = drawSamples(smp,'Burnin',Burnin,'NumSamples',NumSamples,...
        'Start',randn(size(startpoint)));
end

Compute MCMC diagnostics and display the results. Compare the true means in means with the column titled Mean in the MCMCdiagnostics table. The true posterior means are within a few Monte Carlo standard errors (MCSEs) of the estimated posterior means. The HMC sampler has accurately recovered the true means. Similarly, the estimated standard deviations in the column SD are very near the true standard deviations in standev.

MCMCdiagnostics = diagnostics(smp,chains)
MCMCdiagnostics=3×8 table
     Name      Mean        MCSE        SD          Q5        Q95       ESS      RHat
    ______    _______    ________    _______    ________    ______    ______    ____

    {'x1'}     1.0038    0.016474    0.96164    -0.58601     2.563    3407.4     1  
    {'x2'}    -2.0435    0.034933      1.999     -5.3476    1.1851    3274.5     1  
    {'x3'}     1.9957    0.008209    0.49693      1.2036    2.8249    3664.5     1  

means
means = 3×1

     1
    -2
     2

standevs
standevs = 3×1

    1.0000
    2.0000
    0.5000

The normalDistGrad function returns the logarithm of the multivariate normal probability density with means in Mu and standard deviations in Sigma, specified as scalars or columns vectors the same length as the startpoint. The second output argument is the corresponding gradient.

function [lpdf,glpdf] = normalDistGrad(X,Mu,Sigma)
Z = (X - Mu)./Sigma;
lpdf = sum(-log(Sigma) - .5*log(2*pi) - .5*(Z.^2));
glpdf = -Z./Sigma;
end

Tips

Version History

Introduced in R2017a