Particle Optimization in Stochastic Gradient MCMC
This work addresses a key limitation of Bayesian learning on large datasets and offers a potentially more efficient sampling approach, though it appears incremental because it builds on the existing SG-MCMC and SVGD frameworks.
The paper tackles the problem of high sample correlation in stochastic gradient Markov chain Monte Carlo (SG-MCMC) by proposing a novel method that directly optimizes particles from scratch. It shows connections to Stein variational gradient descent (SVGD) and to generative adversarial networks, and, under certain relaxations, interprets the method as an extension of SVGD with momentum.
Stochastic gradient Markov chain Monte Carlo (SG-MCMC) has become increasingly popular in Bayesian learning due to its ability to handle large datasets. A standard SG-MCMC algorithm simulates samples from a discrete-time Markov chain to approximate a target distribution. However, because the samples are generated sequentially, they are typically highly correlated, an undesirable property in SG-MCMC. In contrast, Stein variational gradient descent (SVGD) directly optimizes a set of particles and is able to approximate a target distribution with far fewer samples. In this paper, we propose a novel method to directly optimize particles (or samples) in SG-MCMC from scratch. Specifically, we propose efficient methods to solve the corresponding Fokker-Planck equation on the space of probability distributions, whose solution (i.e., a distribution) is approximated by particles. Through our framework, we are able to show connections of SG-MCMC to SVGD, as well as to the seemingly unrelated generative-adversarial-net framework. Under certain relaxations, particle optimization in SG-MCMC can be interpreted as an extension of standard SVGD with momentum.
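
To make the idea of "SVGD with momentum" concrete, the following is a minimal sketch and not the paper's algorithm. It implements the standard SVGD update direction with an RBF kernel and then wraps it in a generic heavy-ball momentum step. The function names (`svgd_direction`, `run_particles`), the fixed kernel bandwidth, the step size, the momentum coefficient, and the one-dimensional Gaussian target are all assumptions chosen for illustration; the update derived in the paper may differ in form.

```python
import numpy as np


def svgd_direction(x, grad_log_p, bandwidth=1.0):
    """Standard SVGD update direction for particles x of shape (n, d).

    Uses an RBF kernel with a fixed bandwidth (an illustrative assumption).
    """
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]  # diff[i, j] = x_i - x_j
    sq_dist = np.sum(diff ** 2, axis=-1)
    k = np.exp(-sq_dist / (2.0 * bandwidth ** 2))  # kernel matrix, shape (n, n)
    # Driving term: kernel-weighted average of the score, pulls toward high density.
    drive = k @ grad_log_p(x)
    # Repulsive term: gradient of the kernel, keeps particles from collapsing.
    repulse = np.sum(k[:, :, None] * diff, axis=1) / bandwidth ** 2
    return (drive + repulse) / n


def run_particles(x0, grad_log_p, n_steps=1000, step=0.1, momentum=0.5):
    """Move all particles jointly, using a heavy-ball momentum buffer.

    Setting momentum=0.0 recovers plain SVGD. The heavy-ball form is an
    assumption for illustration, not the update derived in the paper.
    """
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(n_steps):
        v = momentum * v + svgd_direction(x, grad_log_p)
        x = x + step * v
    return x


if __name__ == "__main__":
    # Hypothetical target: a 1-D Gaussian with mean 2 and unit variance,
    # whose score is grad log p(x) = -(x - 2).
    rng = np.random.default_rng(0)
    particles = rng.normal(size=(50, 1))
    particles = run_particles(particles, lambda x: -(x - 2.0))
    print("particle mean:", particles.mean(), "particle std:", particles.std())
```

Unlike a sequential SG-MCMC chain, all particles here are updated together at every step, and the repulsive kernel term is what spreads them over the target instead of letting them collapse onto its mode.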