Postprocessor

postprocessor ¶

Postprocessing for parallel MTMLDA runs.

The postprocessing routines evaluate and visualize statistics and render trees for parallel MTMLDA runs. All data can be saved for reproducibility of plots.

Classes:

Name	Description
`PostprocessorSettings`	Dataclass to store postprocessing settings.
`Postprocessor`	Postprocessor for parallel MTMLDA runs.

mtmlda.run.postprocessor.PostprocessorSettings `dataclass` ¶

Dataclass to store postprocessing settings.

Attributes:

Name	Type	Description
`chain_directory`	`Path`	Directory containing the chains in npy format. If `None`, no postprocessing is performed.
`tree_directory`	`Path`	Directory containing exported Markov trees in dot format. If `None`, no rendering is performed.
`output_data_directory`	`Path`	Directory to save statistics data to. If `None`, no data is saved.
`visualization_directory`	`Path`	Directory to save visualizations to. If `None`, no visualizations are generated.
`acf_max_lag`	`int`	Maximum lag for autocorrelation function computation. Chains need to be longer than this value.
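
As a rough illustration of how the `None` convention selects stages, the sketch below mirrors the documented attributes in a stand-in dataclass. `SettingsSketch` and `enabled_stages` are hypothetical names introduced here and are not part of `mtmlda`. The nesting of the checks (saving and plotting only when chains are available) is an assumption, and the directory names are placeholders.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SettingsSketch:
    """Hypothetical stand-in mirroring the documented attributes (not the mtmlda class)."""

    chain_directory: Optional[Path]
    tree_directory: Optional[Path]
    output_data_directory: Optional[Path]
    visualization_directory: Optional[Path]
    acf_max_lag: int


def enabled_stages(settings: SettingsSketch) -> list[str]:
    """List the stages implied by the documented None convention (assumed logic)."""
    stages = []
    if settings.chain_directory is not None:
        stages.append("statistics")
        if settings.output_data_directory is not None:
            stages.append("save data")
        if settings.visualization_directory is not None:
            stages.append("visualizations")
    if settings.tree_directory is not None:
        stages.append("render trees")
    return stages


# Placeholder paths: statistics are evaluated and saved, nothing is plotted or rendered.
settings = SettingsSketch(
    chain_directory=Path("results/chains"),
    tree_directory=None,
    output_data_directory=Path("results/data"),
    visualization_directory=None,
    acf_max_lag=100,
)
print(enabled_stages(settings))
```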

mtmlda.run.postprocessor.Postprocessor ¶

Postprocessor for parallel MTMLDA runs.

Given a set of directories, the postprocessor looks for chain data, evaluates statistics, visualizes these statistics, and renders any exported Markov trees.

Statistics:

- Component-wise autocorrelation
- Component-wise effective sample size
- Component-wise potential scale reduction factor

Visualizations:

- 1D marginal densities
- Pairwise sample distributions
- ESS over sample size
- PSRF over sample size
- Component-wise ACFs
- Markov trees

__init__ ¶

__init__(postprocessor_settings: PostprocessorSettings) -> None

Constructor of the Postprocessor.

Reads in settings and loads chain data.

Parameters:

Name	Type	Description	Default
`postprocessor_settings`	`PostprocessorSettings`	Settings class for postprocessing	required

run ¶

run() -> None

Main method of the postprocessor.

Depending on which paths are provided, the postprocessor evaluates statistics, saves data, generates visualizations, and renders trees.

_load_chain_data ¶

_load_chain_data(chain_directory: Path) -> list[np.ndarray]

Loads chain data from specified directory.

All npy files are interpreted as chains. Chains need to be 2D arrays, where the first dimension corresponds to the number of samples and the second dimension to the number of components. Chains of different length are NOT allowed.

Parameters:

Name	Type	Description	Default
`chain_directory`	`Path`	Directory of raw chain data.	required

Raises:

Type	Description
`FileNotFoundError`	Raised if the check that all chains have the correct format fails.

Returns:

Type	Description
`list[np.ndarray]`	Samples ordered by component.
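
The loading rules above can be sketched with NumPy as follows. This is a minimal sketch, not the library's implementation: `load_chains_sketch` is a hypothetical name, the use of `ValueError` for shape problems is an assumption, and "ordered by component" is interpreted here as one array of shape `(num_chains, num_samples)` per component.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_chains_sketch(chain_directory: Path) -> list[np.ndarray]:
    """Load all npy files in a directory and regroup the samples by component."""
    chains = [np.load(file) for file in sorted(chain_directory.glob("*.npy"))]
    if not chains:
        raise FileNotFoundError(f"No npy files found in {chain_directory}")
    for chain in chains:
        # Each chain must be 2D: (number of samples, number of components).
        if chain.ndim != 2:
            raise ValueError("Chains must be 2D arrays")
        # Chains of different length are not allowed.
        if chain.shape != chains[0].shape:
            raise ValueError("All chains must have the same shape")
    stacked = np.stack(chains)  # shape: (num_chains, num_samples, num_components)
    return [stacked[:, :, i] for i in range(stacked.shape[2])]


# Demonstration with three synthetic chains of 50 samples and 2 components.
with tempfile.TemporaryDirectory() as tmp:
    rng = np.random.default_rng(0)
    for i in range(3):
        np.save(Path(tmp) / f"chain_{i}.npy", rng.normal(size=(50, 2)))
    components = load_chains_sketch(Path(tmp))

print(len(components), components[0].shape)
```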

_compute_marginal_kdes ¶

_compute_marginal_kdes() -> list

Compute 1D marginal densities through KDE.

Note that the samples from all chains are concatenated, including those that might be discarded as burn-in.

Returns:

Name	Type	Description
`list`	`list`	1D marginal densities for each component.

_compute_autocorrelation ¶

_compute_autocorrelation() -> list

Computes ACFs for each chain and component.

Useful for assessing mixing of chains.

Returns:

Name	Type	Description
`list`	`list`	List of lists containing ACFs for each component and chain.
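
For orientation, a normalized sample ACF for a single chain and component can be computed as in the sketch below. The estimator shown (lagged products divided by the lag-0 sum of squares) is a common textbook choice and an assumption here; the documentation does not state which estimator the library uses. `autocorrelation_sketch` is a hypothetical name.

```python
import random
from statistics import fmean


def autocorrelation_sketch(samples: list[float], max_lag: int) -> list[float]:
    """Normalized sample autocorrelation for lags 0..max_lag."""
    n = len(samples)
    if n <= max_lag:
        raise ValueError("Chain must be longer than max_lag")
    mean = fmean(samples)
    centered = [x - mean for x in samples]
    sum_of_squares = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / sum_of_squares
        for lag in range(max_lag + 1)
    ]


# Synthetic AR(1) chain with strong correlation between neighbouring samples.
random.seed(0)
chain = [0.0]
for _ in range(4999):
    chain.append(0.9 * chain[-1] + random.gauss(0.0, 1.0))

acf = autocorrelation_sketch(chain, max_lag=50)
print(acf[0], acf[1], acf[50])
```

Applying such a function to every chain and every component yields a nested list of the kind described in the return value above. A slowly decaying ACF indicates poor mixing.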

_compute_effective_sample_size ¶

_compute_effective_sample_size() -> list

Evaluates the ESS per component.

Evaluation starts at a sample size of 4. The sample size is then increased in steps of 1% of the total sample size per chain.

Returns:

Name	Type	Description
`list`	`list`	ESS for all components.
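
The evaluation schedule described above might look like the following sketch. `evaluation_sizes_sketch` is a hypothetical helper; the rounding of the 1% step and the handling of the final sample size are assumptions.

```python
def evaluation_sizes_sketch(num_samples: int) -> list[int]:
    """Sample sizes at which a statistic is evaluated: start at 4, step by 1%."""
    step = max(1, num_samples // 100)  # assumed rounding; at least one sample per step
    return list(range(4, num_samples + 1, step))


sizes = evaluation_sizes_sketch(1000)
print(sizes[:3], len(sizes))
```

The statistic is then evaluated on the first `size` samples of each chain for every `size` in the schedule, which gives the curve plotted over the sample size.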

_compute_psrf ¶

_compute_psrf() -> list

Computes the potential scale reduction factor (PSRF) for each component.

Evaluation starts at a sample size of 4. The sample size is then increased in steps of 1% of the total sample size per chain.
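
As a point of reference, the classic Gelman-Rubin form of the PSRF for one component can be sketched as below. Whether the library uses this exact formulation or a variant (for example a split or rank-normalized one) is not stated in this documentation, so the formula is an assumption. `psrf_sketch` is a hypothetical name.

```python
import random
from statistics import fmean, variance


def psrf_sketch(chains: list[list[float]]) -> float:
    """Classic Gelman-Rubin PSRF for one component across equally long chains."""
    n = len(chains[0])
    chain_means = [fmean(chain) for chain in chains]
    within = fmean([variance(chain) for chain in chains])  # mean within-chain variance
    between = n * variance(chain_means)  # between-chain variance
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Four well-mixed chains drawn from the same distribution.
random.seed(1)
mixed = [[random.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
r_mixed = psrf_sketch(mixed)

# Shifting one chain mimics chains that have not converged to the same region.
shifted = [list(chain) for chain in mixed]
shifted[0] = [x + 5.0 for x in shifted[0]]
r_shifted = psrf_sketch(shifted)

# Evaluation on growing prefixes, starting at a sample size of 4.
r_over_size = [psrf_sketch([c[:size] for c in mixed]) for size in (4, 500, 2000)]
print(r_mixed, r_shifted, r_over_size)
```

Values close to 1 suggest that the chains agree with each other, whereas values well above 1 indicate that at least one chain explores a different region.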

_save_data ¶

_save_data(
    marginal_densities: list, autocorrelations: list, ess_data: tuple, psrf_data: tuple
) -> None

Save statistics data for reproducibility of plots.

Parameters:

Name	Type	Description	Default
`marginal_densities`	`list`	1D marginal densities from KDE.	required
`autocorrelations`	`list`	ACFs for all components.	required
`ess_data`	`tuple`	ESS for all components.	required
`psrf_data`	`tuple`	PSRF for all components.	required

_visualize_traces ¶

_visualize_traces(components: list) -> None

Visualize traces for all components.

Parameters:

Name	Type	Description	Default
`components`	`list`	Sample collections of all components.	required

_visualize_marginal_densities ¶

_visualize_marginal_densities(marginal_densities: list) -> None

Visualize 1D marginals for each component.

Parameters:

Name	Type	Description	Default
`marginal_densities`	`list`	1D densities, estimated via KDE.	required

_visualize_autocorrelation ¶

_visualize_autocorrelation(autocorrelations: list) -> None

Visualize ACFs for all components.

Parameters:

Name	Type	Description	Default
`autocorrelations`	`list`	Computed ACFs for all components.	required

_visualize_effective_sample_size ¶

_visualize_effective_sample_size(ess_data: tuple) -> None

Visualize ESS for every component.

Parameters:

Name	Type	Description	Default
`ess_data`	`tuple`	ESS for all components.	required

_visualize_psrf ¶

_visualize_psrf(psrf_data: tuple) -> None

Visualize PSRF for every component.

Parameters:

Name	Type	Description	Default
`psrf_data`	`tuple`	PSRF for all components.	required

_visualize_pairwise ¶

_visualize_pairwise() -> None

Visualize all pairwise sample distributions in an all-vs-all fashion.

_render_dot_files ¶

_render_dot_files(tree_directory: Path) -> None

Render tree files with Pydot.

Note: Rendering trees is quite time-consuming.

Parameters:

Name	Type	Description	Default
`tree_directory`	`Path`	Path to stored dot files. Rendered tree images are stored in the same directory.	required
