ML | LG | CO | Oct 19, 2020

A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions

arXiv:2010.09800v2 | 35 citations
AI Analysis

This work addresses the challenge of efficient Bayesian inference for big data, particularly in training deep neural networks, by providing a scalable dynamic importance sampler that improves convergence in non-convex settings.

The authors tackle sampling from multi-modal distributions in Bayesian learning by proposing the contour stochastic gradient Langevin dynamics (CSGLD) algorithm, which automatically flattens the target distribution to make simulation easier. They report that it is better at avoiding local traps on benchmark datasets including CIFAR10 and CIFAR100.

We propose an adaptively weighted stochastic gradient Langevin dynamics (SGLD) algorithm, called contour stochastic gradient Langevin dynamics (CSGLD), for Bayesian learning in big data statistics. The proposed algorithm is essentially a scalable dynamic importance sampler, which automatically flattens the target distribution so that simulation of a multi-modal distribution is greatly facilitated. Theoretically, we prove a stability condition and establish the asymptotic convergence of the self-adapting parameter to a unique fixed point, regardless of the non-convexity of the original energy function; we also present an error analysis for the weighted averaging estimators. Empirically, the CSGLD algorithm is tested on multiple benchmark datasets including CIFAR10 and CIFAR100. The numerical results indicate its superiority in avoiding the local trap problem in training deep neural networks.
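To make the idea of a self-adapting, flattening importance sampler concrete, here is a minimal sketch on a one-dimensional toy problem. Everything specific in it is an assumption for illustration and is not given in the abstract: the double-well `energy`, the equal-width energy bins, the adaptation schedule `omega`, the values of `zeta`, `tau`, and `lr`, and the clipping of the state to a bounded interval (added only to keep the toy run numerically safe). It also uses the exact gradient, whereas a big-data setting would use a mini-batch estimate.

```python
import numpy as np


def energy(x):
    """Hypothetical double-well energy with minima at x = -1 and x = +1."""
    return 2.0 * (x**2 - 1.0) ** 2


def grad_energy(x):
    """Exact gradient of the toy energy (stands in for a mini-batch gradient)."""
    return 8.0 * x * (x**2 - 1.0)


def csgld_sketch(n_steps=20000, lr=1e-3, tau=1.0, zeta=0.75,
                 n_bins=60, x_bound=2.0, seed=0):
    rng = np.random.default_rng(seed)
    # Assumed partition: equal-width energy bins covering [0, energy(x_bound)].
    du = energy(x_bound) / n_bins
    # Self-adapting parameter: estimated probability mass of each energy bin.
    theta = np.full(n_bins, 1.0 / n_bins)
    x = -1.0
    samples = np.empty(n_steps)
    weights = np.empty(n_steps)

    def bin_index(u):
        return min(int(u / du), n_bins - 1)

    for k in range(1, n_steps + 1):
        j = bin_index(energy(x))
        j_prev = max(j - 1, 0)
        # Adaptive gradient multiplier: this is what flattens the target.
        multiplier = 1.0 + zeta * tau * (
            np.log(theta[j]) - np.log(theta[j_prev])) / du
        noise = np.sqrt(2.0 * tau * lr) * rng.standard_normal()
        x = x - lr * multiplier * grad_energy(x) + noise
        # Assumption: keep the state in a bounded interval for stability.
        x = float(np.clip(x, -x_bound, x_bound))

        # Stochastic-approximation update of theta at the new state.
        j_new = bin_index(energy(x))
        omega = 10.0 / (k**0.75 + 1000.0)  # assumed decaying step size
        indicator = np.zeros(n_bins)
        indicator[j_new] = 1.0
        theta = theta + omega * theta[j_new] ** zeta * (indicator - theta)

        samples[k - 1] = x
        # Importance weight that corrects for the flattening.
        weights[k - 1] = theta[j_new] ** zeta

    return samples, weights, theta
```

Under these assumptions, an expectation under the original target would be estimated with a weighted average, for example `np.sum(weights * samples**2) / np.sum(weights)` for the second moment. The update keeps `theta` positive and summing to one by construction, which is why it can be read as a running estimate of how probability mass is spread over energy levels.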

Code Implementations: 2 repos