CO ML | Jul 16, 2019

Stochastic gradient Markov chain Monte Carlo

arXiv:1907.06986v122.1171 citations | Has Code

Originality: Synthesis-oriented

AI Analysis

This is an incremental review that addresses scalability issues in Bayesian inference for practitioners dealing with large datasets.

The paper tackles the computational cost of Markov chain Monte Carlo (MCMC) on large datasets. It introduces readers to stochastic gradient MCMC (SGMCMC), which subsamples the data to reduce the per-iteration cost, reviews the main algorithms and their supporting theory, and compares their efficiency against standard MCMC on benchmark examples.

Markov chain Monte Carlo (MCMC) algorithms are generally regarded as the gold standard technique for Bayesian inference. They are theoretically well-understood and conceptually simple to apply in practice. The drawback of MCMC is that in general performing exact inference requires all of the data to be processed at each iteration of the algorithm. For large data sets, the computational cost of MCMC can be prohibitive, which has led to recent developments in scalable Monte Carlo algorithms that have a significantly lower computational cost than standard MCMC. In this paper, we focus on a particular class of scalable Monte Carlo algorithms, stochastic gradient Markov chain Monte Carlo (SGMCMC), which utilises data subsampling techniques to reduce the per-iteration cost of MCMC. We provide an introduction to some popular SGMCMC algorithms and review the supporting theoretical results, as well as comparing the efficiency of SGMCMC algorithms against MCMC on benchmark examples. The supporting R code is available online.
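To make the subsampling idea concrete, here is a minimal sketch of stochastic gradient Langevin dynamics (SGLD), one widely used SGMCMC algorithm. It is not the paper's R code. The toy model (Gaussian data with unknown mean, unit variance, and a N(0, prior_sd^2) prior), the function name `sgld_gaussian_mean`, and all default parameter values are assumptions chosen for illustration. Each iteration touches only `batch_size` data points and rescales the minibatch gradient by N / batch_size so that it is an unbiased estimate of the full-data gradient.

```python
import math
import random


def sgld_gaussian_mean(data, n_iters=5000, batch_size=50, step_size=1e-4,
                       prior_sd=10.0, seed=0):
    """Illustrative SGLD sampler for the mean of N(theta, 1) data.

    Hypothetical helper: the model, name, and defaults are assumptions.
    """
    rng = random.Random(seed)
    n_data = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_iters):
        # Subsample a minibatch instead of processing the full data set.
        batch = rng.sample(data, batch_size)
        # Gradient of the log prior N(0, prior_sd^2) with respect to theta.
        grad_prior = -theta / prior_sd ** 2
        # Minibatch log-likelihood gradient, rescaled to be unbiased
        # for the full-data gradient sum_i (x_i - theta).
        grad_lik = (n_data / batch_size) * sum(x - theta for x in batch)
        # Langevin step: half-step along the gradient plus Gaussian noise
        # with variance equal to the step size.
        theta += 0.5 * step_size * (grad_prior + grad_lik)
        theta += math.sqrt(step_size) * rng.gauss(0.0, 1.0)
        samples.append(theta)
    return samples


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [data_rng.gauss(2.0, 1.0) for _ in range(1000)]
    draws = sgld_gaussian_mean(data)[1000:]  # discard burn-in
    print("posterior mean estimate:", sum(draws) / len(draws))
```

With a fixed step size the chain only approximates the posterior: the noise in the minibatch gradient inflates the sampled variance, so the step size trades per-iteration cost against accuracy.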

