A Simulated Annealing Approach to Bayesian Inference
The paper provides a method for Bayesian inference in stochastic models whose likelihoods are intractable, a bottleneck for researchers in fields such as computational statistics and machine learning. The contribution appears incremental, since it adapts simulated annealing to this setting.
The paper tackles Bayesian parameter inference for stochastic models by introducing a simulated annealing algorithm that propagates particles in the joint parameter-output space. Metropolis steps combined with adaptive temperature reduction approximate the posterior distribution without evaluating likelihood densities. The temperature is lowered so that entropy production is minimized, and an optimal fast annealing schedule is derived for the special case of no prior knowledge.
A generic algorithm for the extraction of probabilistic (Bayesian) information about model parameters from data is presented. The algorithm propagates an ensemble of particles in the product space of model parameters and outputs. Each particle update consists of a random jump in parameter space, followed by a simulation of a model output and a Metropolis acceptance/rejection step based on a comparison of the simulated output to the data. The distance of a particle to the data is interpreted as an energy, and the algorithm reduces the associated temperature of the ensemble such that entropy production is minimized. If this simulated annealing is not too fast compared to the mixing speed in parameter space, the parameter marginal of the ensemble approaches the Bayesian posterior distribution. Annealing is adaptive and depends on certain extensive thermodynamic quantities that can easily be measured at run time. In the general case, we propose annealing with a constant entropy production rate, which is optimal as long as annealing is not too fast. For the practically relevant special case of no prior knowledge, we derive an optimal fast annealing schedule with a non-constant entropy production rate. The algorithm does not require the calculation of the density of the model likelihood, which makes it interesting for Bayesian parameter inference with stochastic models, whose likelihood functions are typically very high-dimensional integrals.
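The particle update described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the toy model (a Gaussian output centered on a single parameter), the flat prior on [-10, 10], the absolute-difference distance, the Gaussian jump, the form of the acceptance ratio, and all function names. In particular, the geometric cooling used here is only a stand-in; it is not the adaptive, entropy-production-based schedule that the abstract proposes.

```python
import math
import random


def simulate(theta, rng):
    """Hypothetical stochastic model: one noisy output centered on theta."""
    return rng.gauss(theta, 1.0)


def log_prior(theta):
    """Assumed flat prior on [-10, 10]."""
    return 0.0 if -10.0 <= theta <= 10.0 else -math.inf


def distance(y, y_obs):
    """Distance between simulated output and data, interpreted as an energy."""
    return abs(y - y_obs)


def update_particle(theta, energy, y_obs, temperature, jump_scale, rng):
    """Random jump in parameter space, simulation, then a Metropolis step."""
    theta_new = theta + rng.gauss(0.0, jump_scale)  # symmetric random jump
    lp_new = log_prior(theta_new)
    if lp_new == -math.inf:
        return theta, energy  # proposal outside the prior support: reject
    energy_new = distance(simulate(theta_new, rng), y_obs)
    # Assumed acceptance ratio: prior ratio times Boltzmann factor of the
    # energy difference. No likelihood density is evaluated anywhere.
    log_ratio = lp_new - log_prior(theta) - (energy_new - energy) / temperature
    if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
        return theta_new, energy_new
    return theta, energy


def anneal(y_obs, n_particles=500, n_sweeps=200, t_start=5.0,
           cooling=0.97, jump_scale=1.0, seed=0):
    """Propagate an ensemble while lowering the temperature.

    Geometric cooling is a placeholder assumption, not the paper's schedule.
    """
    rng = random.Random(seed)
    thetas = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    energies = [distance(simulate(t, rng), y_obs) for t in thetas]
    temperature = t_start
    for _ in range(n_sweeps):
        for i in range(n_particles):
            thetas[i], energies[i] = update_particle(
                thetas[i], energies[i], y_obs, temperature, jump_scale, rng)
        temperature *= cooling
    return thetas, energies, temperature


if __name__ == "__main__":
    thetas, energies, temperature = anneal(y_obs=1.5)
    print("final temperature:", temperature)
    print("ensemble mean of theta:", sum(thetas) / len(thetas))
    print("mean energy:", sum(energies) / len(energies))
```

Each particle carries both its parameter value and the energy of its most recent accepted simulation, which reflects the product space of parameters and outputs. With a fixed cooling factor there is no guarantee that annealing stays slow relative to mixing in parameter space, so the resulting ensemble should be read as a rough approximation only.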