LG NA CO ML

Nov 2, 2019

Laplacian Smoothing Stochastic Gradient Markov Chain Monte Carlo

Bao Wang, Difan Zou, Quanquan Gu, Stanley Osher

arXiv:1911.00782v16.09 citations

Has Code

Originality: Incremental advance

AI Analysis

This work addresses slow convergence in MCMC methods for Bayesian machine learning, offering an incremental improvement with theoretical guarantees and empirical validation.

The paper tackles the slow convergence of stochastic gradient Langevin dynamics (SGLD) in Bayesian learning by proposing LS-SGLD, which applies Laplacian smoothing to reduce the variance caused by the stochastic gradient. The authors prove that LS-SGLD achieves strictly smaller discretization error in 2-Wasserstein distance for both log-concave and non-log-concave densities, at the cost of a slightly slower mixing rate. Experiments on posterior sampling, Bayesian logistic regression, and Bayesian convolutional neural networks show superior performance.

As an important Markov Chain Monte Carlo (MCMC) method, the stochastic gradient Langevin dynamics (SGLD) algorithm has achieved great success in Bayesian learning and posterior sampling. However, SGLD typically suffers from a slow convergence rate due to the large variance caused by the stochastic gradient. To alleviate this drawback, we leverage the recently developed Laplacian Smoothing (LS) technique and propose a Laplacian smoothing stochastic gradient Langevin dynamics (LS-SGLD) algorithm. We prove that for sampling from both log-concave and non-log-concave densities, LS-SGLD achieves strictly smaller discretization error in $2$-Wasserstein distance, although its mixing rate can be slightly slower. Experiments on both synthetic and real datasets verify our theoretical results and demonstrate the superior performance of LS-SGLD on different machine learning tasks, including posterior sampling, Bayesian logistic regression, and training Bayesian convolutional neural networks. The code is available at https://github.com/BaoWangMath/LS-MCMC.
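The abstract does not spell out the update rule, so the following is only a minimal sketch of how a Laplacian-smoothed Langevin step might look. It assumes the smoothing matrix is A_sigma = I - sigma * L, with L the one-dimensional periodic discrete Laplacian, and that the step preconditions the stochastic gradient with A_sigma^{-1} and the injected Gaussian noise with A_sigma^{-1/2}. The function names, the inverse temperature `beta`, and the toy Gaussian target are hypothetical and are not taken from the authors' repository.

```python
import numpy as np


def ls_eigenvalues(dim, sigma):
    """Eigenvalues of the assumed A_sigma = I - sigma * L (L: periodic 1-D Laplacian)."""
    k = np.arange(dim)
    return 1.0 + 2.0 * sigma * (1.0 - np.cos(2.0 * np.pi * k / dim))


def apply_power(v, eig, power):
    """Apply A_sigma**power to v. A_sigma is circulant, so the DFT diagonalizes it."""
    return np.real(np.fft.ifft(np.fft.fft(v) * eig ** power))


def ls_sgld_step(x, stoch_grad, step, sigma, rng, beta=1.0):
    """One assumed LS-SGLD step; sigma = 0 reduces it to a plain SGLD step."""
    eig = ls_eigenvalues(x.size, sigma)
    noise = rng.standard_normal(x.size)
    drift = apply_power(stoch_grad(x), eig, -1.0)   # smoothed stochastic gradient
    diffusion = apply_power(noise, eig, -0.5)       # matching preconditioned noise
    return x - step * drift + np.sqrt(2.0 * step / beta) * diffusion


# Toy usage: target N(0, I) in 16 dimensions with an artificially noisy gradient.
rng = np.random.default_rng(0)


def noisy_grad(x):
    # Exact gradient of 0.5 * ||x||^2 plus synthetic noise standing in for
    # minibatch noise (an assumption made only for this illustration).
    return x + 0.5 * rng.standard_normal(x.size)


x = np.zeros(16)
samples = []
for _ in range(1000):
    x = ls_sgld_step(x, noisy_grad, step=0.05, sigma=1.0, rng=rng)
    samples.append(x.copy())
samples = np.asarray(samples)
```

Because every eigenvalue of the assumed A_sigma is at least 1, applying A_sigma^{-1} can never increase the Euclidean norm of the gradient noise, which is one way to read the variance reduction the abstract describes. The FFT keeps each step at O(d log d) instead of solving a linear system.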
