Postprocessor

postprocessor

Postprocessing for parallel MTMLDA runs.

The postprocessing routines evaluate and visualize statistics and render trees for parallel MTMLDA runs. All data can be saved for reproducibility of plots.

Classes:

Name Description
PostprocessorSettings

Dataclass to store postprocessing settings.

Postprocessor

Postprocessor for parallel MTMLDA runs.

mtmlda.run.postprocessor.PostprocessorSettings dataclass

Dataclass to store postprocessing settings.

Attributes:

Name Type Description
chain_directory Path

Directory containing the chains in npy format. If None, no postprocessing is performed.

tree_directory Path

Directory containing exported Markov trees in dot format. If None, no rendering is performed.

output_data_directory Path

Directory to save statistics data to. If None, no data is saved.

visualization_directory Path

Directory to save visualizations to. If None, no visualizations are generated.

acf_max_lag int

Maximum lag for autocorrelation function computation. Chains need to be longer than this value.

mtmlda.run.postprocessor.Postprocessor

Postprocessor for parallel MTMLDA runs.

Given a set of directories, the postprocessor looks for chain data, evaluates statistics, visualizes them, and renders any exported Markov trees.

Statistics:

- Component-wise autocorrelation
- Component-wise effective sample size
- Component-wise potential scale reduction factor

Visualizations:

- 1D marginal densities
- Pairwise sample distributions
- ESS over sample size
- PSRF over sample size
- Component-wise ACFs
- Markov trees

__init__

__init__(postprocessor_settings: PostprocessorSettings) -> None

Constructor of the Postprocessor.

Reads in settings and loads chain data.

Parameters:

Name Type Description Default
postprocessor_settings PostprocessorSettings

Settings class for postprocessing

required

run

run() -> None

Main method of the postprocessor.

Depending on which paths are provided, the postprocessor evaluates statistics, saves data, generates visualizations, and renders trees.

_load_chain_data

_load_chain_data(chain_directory: Path) -> list[np.ndarray]

Loads chain data from specified directory.

All npy files are interpreted as chains. Chains need to be 2D arrays, where the first dimension corresponds to the number of samples and the second dimension to the number of components. Chains of different length are NOT allowed.

Parameters:

Name Type Description Default
chain_directory Path

Directory of raw chain data.

required

Raises:

Type Description
FileNotFoundError

Raised if the chains do not have the correct format.

Returns:

Type Description
list[np.ndarray]

Samples ordered by component.
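
The format requirements above can be illustrated with a minimal sketch using only NumPy. The helper `load_chains` is hypothetical and not part of the library. It returns one array per chain rather than the component ordering of the actual return value, and its use of `ValueError` for format violations is an assumption.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_chains(chain_directory: Path) -> list[np.ndarray]:
    """Hypothetical helper: load all .npy files and check the documented format."""
    chain_files = sorted(chain_directory.glob("*.npy"))
    if not chain_files:
        raise FileNotFoundError(f"No .npy files found in {chain_directory}")
    chains = [np.load(chain_file) for chain_file in chain_files]
    for chain_file, chain in zip(chain_files, chains):
        # Each chain must have shape (num_samples, num_components)
        if chain.ndim != 2:
            raise ValueError(f"{chain_file.name} is not a 2D array")
        # All chains must have the same length and number of components
        if chain.shape != chains[0].shape:
            raise ValueError(f"{chain_file.name} differs in shape from {chain_files[0].name}")
    return chains


# Usage: write two chains with 100 samples and 3 components, then read them back
with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    rng = np.random.default_rng(0)
    for i in range(2):
        np.save(directory / f"chain_{i}.npy", rng.normal(size=(100, 3)))
    chains = load_chains(directory)
```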

_compute_marginal_kdes

_compute_marginal_kdes() -> list

Compute 1D marginal densities through KDE.

Note that the samples from all chains are concatenated, including those that might be discarded as burn-in.

Returns:

Name Type Description
list list

1D marginal densities for each component.
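
As an illustration of the concatenate-then-estimate approach described above, the following sketch evaluates a Gaussian KDE per component with plain NumPy. The function `gaussian_kde_1d`, the normal-reference bandwidth rule, and the evaluation grid are assumptions made for this example. The KDE implementation actually used by the postprocessor is not specified here.

```python
import numpy as np


def gaussian_kde_1d(samples: np.ndarray, grid: np.ndarray) -> np.ndarray:
    """Hypothetical helper: Gaussian KDE of 1D samples, evaluated on a grid."""
    num_samples = samples.shape[0]
    # Normal-reference rule-of-thumb bandwidth (an assumption for this sketch)
    bandwidth = 1.06 * samples.std(ddof=1) * num_samples ** (-1 / 5)
    scaled = (grid[:, None] - samples[None, :]) / bandwidth
    kernel_values = np.exp(-0.5 * scaled**2) / np.sqrt(2.0 * np.pi)
    return kernel_values.sum(axis=1) / (num_samples * bandwidth)


# Two synthetic chains with 500 samples and 2 components each
rng = np.random.default_rng(0)
chains = [rng.normal(size=(500, 2)) for _ in range(2)]

# Concatenate the samples of all chains, burn-in included
all_samples = np.concatenate(chains, axis=0)
grid = np.linspace(-6.0, 6.0, 241)
marginal_densities = [
    gaussian_kde_1d(all_samples[:, j], grid) for j in range(all_samples.shape[1])
]
```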

_compute_autocorrelation

_compute_autocorrelation() -> list

Computes ACFs for each chain and component.

Useful for assessing mixing of chains.

Returns:

Name Type Description
list list

List of lists containing ACFs for each component and chain.
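
For orientation, a normalized sample ACF for one chain and one component could be computed as in the sketch below. The function `autocorrelation` is hypothetical, and the estimator used by the library may differ. The check on the chain length mirrors the requirement stated for `acf_max_lag`.

```python
import numpy as np


def autocorrelation(samples: np.ndarray, max_lag: int) -> np.ndarray:
    """Hypothetical helper: normalized sample ACF for lags 0..max_lag."""
    num_samples = samples.shape[0]
    if num_samples <= max_lag:
        raise ValueError("Chain must be longer than max_lag")
    centered = samples - samples.mean()
    variance = np.dot(centered, centered) / num_samples
    acf = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        # Biased autocovariance estimate at this lag, divided by the variance
        covariance = np.dot(centered[: num_samples - lag], centered[lag:]) / num_samples
        acf[lag] = covariance / variance
    return acf


# One synthetic chain with 500 samples and 2 components
rng = np.random.default_rng(0)
chain = rng.normal(size=(500, 2))
acfs = [autocorrelation(chain[:, j], max_lag=20) for j in range(chain.shape[1])]
```

An ACF that decays quickly towards zero indicates weakly correlated samples, whereas slow decay points to poor mixing.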

_compute_effective_sample_size

_compute_effective_sample_size() -> list

Evaluates the ESS per component.

Evaluation starts at a sample size of 4, which is then increased in steps of 1% of the total sample size per chain.

Returns:

Name Type Description
list list

ESS for all components.
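
The evaluation schedule described above can be sketched as follows. Both helpers are hypothetical. The ESS estimator shown here works on a single chain and truncates the autocorrelation sum at the first non-positive lag, which is an assumption and not necessarily the estimator used by the postprocessor.

```python
import numpy as np


def sample_size_grid(num_samples: int, start: int = 4) -> np.ndarray:
    """Hypothetical helper: sample sizes from `start` in steps of 1% of the chain length."""
    step = max(1, num_samples // 100)
    return np.arange(start, num_samples + 1, step)


def effective_sample_size(samples: np.ndarray) -> float:
    """Hypothetical single-chain ESS: n / (1 + 2 * sum of leading positive ACF values)."""
    num_samples = samples.shape[0]
    centered = samples - samples.mean()
    variance = np.dot(centered, centered) / num_samples
    if variance == 0.0:
        return float(num_samples)
    rho_sum = 0.0
    for lag in range(1, num_samples):
        rho = np.dot(centered[: num_samples - lag], centered[lag:]) / (num_samples * variance)
        if rho <= 0.0:
            # Truncate the sum at the first non-positive autocorrelation
            break
        rho_sum += rho
    return float(num_samples / (1.0 + 2.0 * rho_sum))


# ESS of one synthetic component, evaluated over increasing sample sizes
rng = np.random.default_rng(0)
component_samples = rng.normal(size=1000)
sizes = sample_size_grid(component_samples.shape[0])
ess_values = [effective_sample_size(component_samples[:size]) for size in sizes]
```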

_compute_psrf

_compute_psrf() -> list

Computes the potential scale reduction factor (PSRF) for each component.

Evaluation starts at a sample size of 4, which is then increased in steps of 1% of the total sample size per chain.
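
As a point of reference, the classic Gelman-Rubin form of the PSRF for a single component can be sketched as below. The function `psrf` is hypothetical, and the postprocessor may use a different variant, for example a split or rank-normalized one.

```python
import numpy as np


def psrf(chains: np.ndarray) -> float:
    """Hypothetical helper: classic PSRF for one component.

    `chains` has shape (num_chains, num_samples).
    """
    num_samples = chains.shape[1]
    # Mean of the within-chain variances
    within = chains.var(axis=1, ddof=1).mean()
    # Between-chain variance, scaled by the chain length
    between = num_samples * chains.mean(axis=1).var(ddof=1)
    pooled = (num_samples - 1) / num_samples * within + between / num_samples
    return float(np.sqrt(pooled / within))


rng = np.random.default_rng(0)

# Four chains sampling the same distribution: PSRF close to 1
converged = rng.normal(size=(4, 1000))
psrf_converged = psrf(converged)

# Two of the chains shifted by 5: PSRF clearly above 1
shifted = converged + np.array([0.0, 0.0, 5.0, 5.0])[:, None]
psrf_shifted = psrf(shifted)
```

Values close to 1 suggest that the chains agree with each other, while larger values indicate that they have not yet converged to the same distribution.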

_save_data

_save_data(
    marginal_densities: list, autocorrelations: list, ess_data: tuple, psrf_data: tuple
) -> None

Save statistics data for reproducibility of plots.

Parameters:

Name Type Description Default
marginal_densities list

1D marginal densities from KDE.

required
autocorrelations list

ACFs for all components.

required
ess_data tuple

ESS for all components.

required
psrf_data tuple

PSRF for all components.

required

_visualize_traces

_visualize_traces(components: list) -> None

Visualize traces for all components.

Parameters:

Name Type Description Default
components list

Sample collections of all components.

required

_visualize_marginal_densities

_visualize_marginal_densities(marginal_densities: list) -> None

Visualize 1D marginals for each component.

Parameters:

Name Type Description Default
marginal_densities list

1D densities, estimated via KDE.

required

_visualize_autocorrelation

_visualize_autocorrelation(autocorrelations: list) -> None

Visualize ACFs for all components.

Parameters:

Name Type Description Default
autocorrelations list

Computed ACFs for all components.

required

_visualize_effective_sample_size

_visualize_effective_sample_size(ess_data: tuple) -> None

Visualize ESS for every component.

Parameters:

Name Type Description Default
ess_data tuple

ESS for all components.

required

_visualize_psrf

_visualize_psrf(psrf_data: tuple) -> None

Visualize PSRF for every component.

Parameters:

Name Type Description Default
psrf_data tuple

PSRF for all components.

required

_visualize_pairwise

_visualize_pairwise() -> None

Visualize all pairwise sample distributions in an all-vs-all fashion.

_render_dot_files

_render_dot_files(tree_directory: Path) -> None

Render tree files with Pydot.

Note: Rendering trees is quite time-consuming.

Parameters:

Name Type Description Default
tree_directory Path

Path to stored dot files. Rendered tree images are stored in the same directory.

required