Markov-Chain Monte-Carlo
The Markov-Chain-Monte-Carlo algorithm (MCMC) is closely related to active learning but candidate points are sampled from a distribution and either accepted or rejected according to their posterior probability rather than actively learned. Furthermore, MCMC focuses on finding the highest posterior distribution of the parameters in place of optimizing a loss function as in active learning.
Given observed data and a simulation model with certain (physical) parameters, an MCMC simulation estimates the best posterior distribution of these parameters.
The ingredients to an MCMC algorithm are the following:
- Data
Observed (experimental) data. Which path is given in the configuration as parameter
reference_data
.
- Model simulation
Usual simulation to model the observed data with parameter vector \(\theta\).
- Sampling algorithm
Method, how next input points to the simulation are chosen. In proFit the Metropolis-Hastings algorithm is used.
- Likelihood function
Function to compare candidate points with already sampled points. In proFit, a Gaussian likelihood is chosen.
As the initial MCMC point starts at a random position in search space, warmup
cycles are necessary to reach the desired area of high posterior probabilities. A
widely used target acceptance rate of accepted versus rejected points is the well
known asymptotically optimal target acceptance rate of \(0.234\). This value has to be
treated with caution, as there may be significantly better suited acceptance rates for a
particular problem. The target acceptance rate can be adjusted in the configuration with the
parameter target_acceptance_rate
.
Given a specific target acceptance rate, the step size of how far in some direction in parameter
space the next point is chosen, is adapted throughout the warmup cycles. After all warmup cycles are
complete, the warmup points are discarded and the actual MCMC run, which yields the final posterior
distribution with uncertainties for the parameters, is executed. The config parameter last_percent
specifies which fraction of the main MCMC run is used for calculating the mean and variance of the posterior distribution.
This lasts normally until a convergence criterion is satisfied or the
maximum number of MCMC runs is reached. The log-likelihoods of each step are saved to
./log_likelihohod.txt
and the parameter’s final mean and standard deviation are saved to ./mcmc_stats.txt
.
A special feature of the proFit MCMC algorithm is delayed acceptance (DA),
which uses a surrogate model of the loss function to estimate its expected minimum
and can thus reject prospected MCMC points by the surrogate model, already
before an (expensive) simulation run is started, which can reduce computation time.
As of now, DA can be utilized only after the first warmup cycle. Before, points are accepted
or rejected only by running the simulation. It is planned to use the SimpleAL
active learning algorithm to also sample the first points intelligently.
Delayed acceptance can be activated in the configuration by setting the parameter
delayed_acceptance
.
Results
Setting the parameter make_plot
in the active_learning
section plots the
MCMC results for each warmup cycle and the main MCMC loop.
Examples
ntrain: 1000 # Total number of training points.
...
active_learning:
nwarmup: 50 # Number of warmup points per cycle
algorithm:
class: mcmc
reference_data: ./experimental_data.txt
warmup_cycles: 5
target_acceptance_rate: 0.35
sigma_n: 0.05 # Estimated data noise (standard deviation)
initial_points: [0.5, 1] # Starting points for each dimension. If None, randomly chosen in search space.
last_percent: 0.25 # Last points used to calculate mean and variance of posterior.
save: ./mcmc_model.hdf5 # Save MCMC model path.
delayed_acceptance: True # Use delayed acceptance with surrogate specified in `fit` configuration.