Parallel Runner

runner

Chain-Parallel MTMLDA Runner.

The ParallelRunner class provides a wrapper for running parallel chains of the MTMLDA sampler. It unifies generic MTMLDA components with application-specific components. The latter are provided by an application builder that adheres to an interface contract. This approach allows a wide range of applications to be interfaced to MTMLDA via this runner. The runner constructs all generic and problem-specific components from the respective settings files. It further adjusts settings for parallel runs depending on the invoking process ID, and initializes the chains accordingly. Parallel runs are conducted in a multiprocessing pool.

Classes:

Name Description
ParallelRunSettings

Data class for parallel run settings

ParallelRunner

Runner class for parallel MTMLDA runs

mtmlda.run.runner.ParallelRunSettings dataclass

Data class for parallel run settings, used for the run.py wrapper.

Attributes:

Name Type Description
num_chains int

Number of parallel chains to run

chain_save_path Path

Path to save MCMC chain data; the process ID is appended

chain_load_path Path

Path to load MCMC chain data; the process ID is appended

node_save_path Path

Path to save node data; the process ID is appended

node_load_path Path

Path to load node data; the process ID is appended

rng_state_save_path Path

Path to save RNG state data; the process ID is appended

rng_state_load_path Path

Path to load RNG state data; the process ID is appended

overwrite_chain bool

Overwrite existing chain data

overwrite_node bool

Overwrite existing node data

overwrite_rng_states bool

Overwrite existing RNG state data
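
The attributes above can be illustrated with a small sketch. `ParallelRunSettingsSketch` is a hypothetical stand-in that only mirrors the documented field names, and `with_process_id` is an illustrative helper; the `_<id>` naming scheme and the example paths are assumptions, not the library's confirmed behavior.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ParallelRunSettingsSketch:
    """Hypothetical stand-in that mirrors the documented attributes."""

    num_chains: int
    chain_save_path: Path
    chain_load_path: Path
    node_save_path: Path
    node_load_path: Path
    rng_state_save_path: Path
    rng_state_load_path: Path
    overwrite_chain: bool
    overwrite_node: bool
    overwrite_rng_states: bool


def with_process_id(path: Path, process_id: int) -> Path:
    """Append the process ID to the file stem (assumed naming scheme)."""
    return path.with_name(f"{path.stem}_{process_id}{path.suffix}")


settings = ParallelRunSettingsSketch(
    num_chains=4,
    chain_save_path=Path("results/chain"),
    chain_load_path=Path("results/chain"),
    node_save_path=Path("results/node"),
    node_load_path=Path("results/node"),
    rng_state_save_path=Path("results/rng_states"),
    rng_state_load_path=Path("results/rng_states"),
    overwrite_chain=True,
    overwrite_node=True,
    overwrite_rng_states=True,
)

# One chain file per process, so parallel chains never write to the same file.
chain_files = [
    with_process_id(settings.chain_save_path, process_id)
    for process_id in range(settings.num_chains)
]
```

Deriving every path from a shared base keeps the settings identical across processes, with the process ID as the only distinguishing input.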

mtmlda.run.runner.ParallelRunner

Parallel runner for MTMLDA sampling.

Methods:

Name Description
run

Main interface to run parallel sampling

__init__

__init__(
    application_builder: abstract_builder.ApplicationBuilder,
    parallel_run_settings: ParallelRunSettings,
    sampler_setup_settings: sampling.SamplerSetupSettings,
    sampler_run_settings: sampling.SamplerRunSettings,
    logger_settings: logging.LoggerSettings,
    inverse_problem_settings: abstract_builder.InverseProblemSettings,
    sampler_component_settings: abstract_builder.SamplerComponentSettings,
    initial_state_settings: abstract_builder.InitialStateSettings,
) -> None

Constructor of the parallel runner.

Takes in all relevant settings for setup and execution of the parallel MTMLDA.

Parameters:

Name Type Description Default
application_builder abstract_builder.ApplicationBuilder

Builder for the application that delivers problem-specific components for the sampler

required
parallel_run_settings ParallelRunSettings

Settings for parallel sampling of multiple chains

required
sampler_setup_settings sampling.SamplerSetupSettings

Settings for the setup of the MTMLDA sampler object

required
sampler_run_settings sampling.SamplerRunSettings

Settings for starting a run with the MTMLDA sampler object

required
logger_settings logging.LoggerSettings

Settings for logging during sampling

required
inverse_problem_settings abstract_builder.InverseProblemSettings

Settings for the (posterior) distribution setup of the Application constructed by the builder.

required
sampler_component_settings abstract_builder.SamplerComponentSettings

Settings for the sampler component setup provided by the application builder

required
initial_state_settings abstract_builder.InitialStateSettings

Settings for initialization of the Markov chains as conducted by the application builder

required

run

run() -> None

Runs the function _execute_mtmlda_on_procs in a multiprocessing pool.
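
As a minimal sketch of this pattern, the snippet below maps a worker function over process IDs in a standard-library multiprocessing pool. `execute_chain` and `run_parallel` are hypothetical names; the real worker runs an MTMLDA chain, and the exact pool call used by the runner is an assumption here.

```python
import multiprocessing


def execute_chain(process_id: int) -> str:
    """Hypothetical worker; the real runner executes one MTMLDA chain here."""
    return f"chain {process_id} finished"


def run_parallel(num_chains: int) -> list[str]:
    """Run one worker per chain and collect the results in process-ID order."""
    with multiprocessing.Pool(processes=num_chains) as pool:
        # Each worker receives its process ID as the only argument.
        return pool.map(execute_chain, range(num_chains))


if __name__ == "__main__":
    for message in run_parallel(num_chains=3):
        print(message)
```

The worker must be defined at module level so that it can be pickled and sent to the pool processes.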

_execute_mtmlda_on_procs

_execute_mtmlda_on_procs(process_id: int) -> None

Executes parallel MTMLDA chains.

This is the main execution method, which is mapped to the workers of a multiprocessing pool in run. Execution steps are as follows:

1. Modify settings depending on process ID
2. Set up application components
3. Set up sampler components
4. Adjust initialization settings based on process ID
5. Run the sampler
6. Save results to disk

Parameters:

Name Type Description Default
process_id int

Id of the invoking process

required

_modify_process_dependent_settings

_modify_process_dependent_settings(process_id: int) -> None

Modify settings depending on process ID.

Sets up process-specific paths and RNG seeds. Also disables printing for all but the first process.

Parameters:

Name Type Description Default
process_id int

ID of the invoking process

required
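
The process-dependent adjustments described for this method could look roughly like the sketch below. `ProcessSettingsSketch` and its fields are hypothetical, and offsetting the seed by the process ID is just one simple way to give each chain a distinct seed; the library's actual scheme is not documented here.

```python
from dataclasses import dataclass, replace
from pathlib import Path


@dataclass(frozen=True)
class ProcessSettingsSketch:
    """Hypothetical subset of settings that differ between processes."""

    seed: int
    chain_save_path: Path
    do_printing: bool


def modify_for_process(
    base: ProcessSettingsSketch, process_id: int
) -> ProcessSettingsSketch:
    """Return a per-process copy of the base settings."""
    save_path = base.chain_save_path
    return replace(
        base,
        # Assumed scheme: distinct seed per process via a simple offset.
        seed=base.seed + process_id,
        # Assumed naming: process ID appended to the file name.
        chain_save_path=save_path.with_name(f"{save_path.name}_{process_id}"),
        # Only the first process prints, to keep console output readable.
        do_printing=(process_id == 0),
    )


base_settings = ProcessSettingsSketch(
    seed=10, chain_save_path=Path("results/chain"), do_printing=True
)
settings_for_process_2 = modify_for_process(base_settings, process_id=2)
```

Returning a modified copy rather than mutating shared settings keeps the base configuration unchanged; the documented method returns None, so it presumably updates the runner's settings in place instead.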

_adjust_initialization_settings

_adjust_initialization_settings(
    process_id: int,
    app_builder: abstract_builder.ApplicationBuilder,
    mtmlda_sampler: sampling.MTMLDASampler,
) -> None

Adjust settings for initialization of the Markov chain, depending on the process ID.

Every process saves its results to process-dependent files, so the initialization of the chain has to take that into account as well. This method is specific to the initialization of a chain after all components have been constructed. Other process-specific settings are adjusted in the _modify_process_dependent_settings method.

Parameters:

Name Type Description Default
process_id int

ID of the invoking process in parallel runs

required
app_builder abstract_builder.ApplicationBuilder

Builder of the application from which problem-specific components are generated

required
mtmlda_sampler sampling.MTMLDASampler

MTMLDA sampler object

required

_save_results

_save_results(
    process_id: int,
    rng_states: sampling.RNGStates,
    mcmc_chain: np.ndarray,
    final_node: mltree.MTNode,
) -> None

Save data from sampling to disk.

All data is stored in process-dependent files. Numeric arrays are stored in npy format, other data is pickled.

Parameters:

Name Type Description Default
process_id int

Id of the invoking process in parallel runs

required
rng_states sampling.RNGStates

RNG states of the sampler, saved so that previously sampled chains can be extended

required
mcmc_chain np.ndarray

Generated samples

required
final_node mltree.MTNode

Last node in the utilized Markov tree, used to restart sampling from this point

required
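
The storage scheme described for _save_results (npy for numeric arrays, pickle for other data) can be sketched as follows. `save_results_sketch`, the file names, and the dictionary used in place of an `MTNode` are illustrative assumptions, not the library's implementation.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np


def save_results_sketch(
    directory: Path, process_id: int, mcmc_chain: np.ndarray, final_node: object
) -> tuple[Path, Path]:
    """Store the chain as .npy and the node as a pickle, one file set per process."""
    directory.mkdir(parents=True, exist_ok=True)
    # Assumed naming: process ID appended to the file stem.
    chain_file = directory / f"chain_{process_id}.npy"
    node_file = directory / f"node_{process_id}.pkl"
    np.save(chain_file, mcmc_chain)
    with node_file.open("wb") as file:
        pickle.dump(final_node, file)
    return chain_file, node_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        samples = np.zeros((5, 2))
        chain_file, node_file = save_results_sketch(
            Path(tmp), 0, samples, {"state": [0.0, 0.0]}
        )
        print(np.load(chain_file).shape, node_file.name)
```

Files written this way can be read back with `np.load` and `pickle.load`, which is what makes restarting or extending a chain from saved data possible.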