On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo
This work studies how to improve sampling efficiency in Bayesian inference. It is aimed at researchers in machine learning and statistics and offers incremental theoretical advances.
The paper analyzes convergence guarantees for variance-reduction methods in stochastic gradient Monte Carlo. It provides sharp bounds in Wasserstein distance for log-posteriors that are smooth, strongly convex, and Hessian Lipschitz, and the results are verified on real-world and synthetic datasets.
We provide convergence guarantees in Wasserstein distance for a variety of variance-reduction methods: SAGA Langevin diffusion, SVRG Langevin diffusion and control-variate underdamped Langevin diffusion. We analyze these methods under a uniform set of assumptions on the log-posterior distribution, assuming it to be smooth, strongly convex and Hessian Lipschitz. This is achieved by a new proof technique combining ideas from finite-sum optimization and the analysis of sampling methods. Our sharp theoretical bounds allow us to identify regimes of interest where each method performs better than the others. Our theory is verified with experiments on real-world and synthetic datasets.
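To make the variance-reduction idea concrete, the following is a minimal sketch of an SVRG-style Langevin sampler. It is an illustration under stated assumptions, not the algorithm or tuning analyzed in the paper. The target is assumed to be a Bayesian linear regression posterior, whose negative log-density is a finite sum of quadratic terms and is therefore smooth, strongly convex, and trivially Hessian Lipschitz. The function names `grad_terms`, `svrg_gradient`, and `svrg_langevin`, as well as the step size, epoch length, and batch size, are hypothetical choices made for this example.

```python
import numpy as np

def grad_terms(x, A, y, lam, n_total):
    """Per-datum gradients of the assumed potential
    f_i(x) = 0.5 * (a_i . x - y_i)^2 + lam / (2 * n_total) * |x|^2.
    Returns an array of shape (len(y), d)."""
    return (A @ x - y)[:, None] * A + (lam / n_total) * x

def svrg_gradient(x, anchor, anchor_grad, A, y, lam, idx):
    """SVRG estimate of the full gradient at x: the full gradient stored at
    the anchor plus a rescaled minibatch correction. Unbiased when idx is
    drawn uniformly with replacement."""
    n = A.shape[0]
    diff = (grad_terms(x, A[idx], y[idx], lam, n)
            - grad_terms(anchor, A[idx], y[idx], lam, n))
    return anchor_grad + (n / len(idx)) * diff.sum(axis=0)

def svrg_langevin(A, y, lam, step, n_steps, epoch_len, batch, rng):
    """Euler discretization of overdamped Langevin dynamics driven by the
    SVRG gradient estimate. Hyperparameters are illustrative assumptions."""
    n, d = A.shape
    x = np.zeros(d)
    samples = np.empty((n_steps, d))
    for k in range(n_steps):
        if k % epoch_len == 0:
            # Start of an epoch: refresh the anchor and its full gradient.
            anchor = x.copy()
            anchor_grad = grad_terms(anchor, A, y, lam, n).sum(axis=0)
        idx = rng.integers(0, n, size=batch)  # uniform, with replacement
        g = svrg_gradient(x, anchor, anchor_grad, A, y, lam, idx)
        x = x - step * g + np.sqrt(2.0 * step) * rng.standard_normal(d)
        samples[k] = x
    return samples

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 2))
    y = A @ np.array([1.0, -2.0]) + rng.standard_normal(200)
    draws = svrg_langevin(A, y, lam=1.0, step=1e-3, n_steps=5000,
                          epoch_len=50, batch=10, rng=rng)
    exact_mean = np.linalg.solve(A.T @ A + np.eye(2), A.T @ y)
    print("sample mean:", draws[1000:].mean(axis=0))
    print("exact posterior mean:", exact_mean)
```

In this sketch the correction term vanishes whenever the current iterate equals the anchor, so the estimator's variance shrinks as the chain stays close to the anchor. The price is one full-gradient pass per epoch, which is the kind of trade-off the regime comparison in the abstract concerns.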