LG · AI · CV · ME · ML · Feb 11, 2019

Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning

arXiv:1902.03932v2 · 296 citations
Originality: Incremental advance
AI Analysis

This paper addresses efficient Bayesian inference for deep neural networks and offers practitioners a scalable method, though the advance is incremental because it builds on existing SG-MCMC techniques.

The paper tackles the challenge of exploring high-dimensional, multimodal posteriors in Bayesian deep learning by developing Cyclical Stochastic Gradient MCMC, which uses a cyclical stepsize schedule to discover and characterize modes, and demonstrates scalability and effectiveness on datasets like ImageNet.

The posteriors over neural network weights are high dimensional and multimodal. Each mode typically characterizes a meaningfully different representation of the data. We develop Cyclical Stochastic Gradient MCMC (SG-MCMC) to automatically explore such distributions. In particular, we propose a cyclical stepsize schedule, where larger steps discover new modes, and smaller steps characterize each mode. We also prove non-asymptotic convergence of our proposed algorithm. Moreover, we provide extensive experimental results, including ImageNet, to demonstrate the scalability and effectiveness of cyclical SG-MCMC in learning complex multimodal distributions, especially for fully Bayesian inference with modern deep neural networks.
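The cyclical stepsize idea can be illustrated with a minimal sketch. The text above only says that each cycle moves from larger steps to smaller ones; the cosine shape used here, along with the names `cyclical_stepsize`, `total_iters`, `num_cycles`, and `alpha0`, are assumptions made for illustration and should not be read as the authors' reference implementation.

```python
import math


def cyclical_stepsize(k, total_iters, num_cycles, alpha0):
    """Stepsize at 1-indexed iteration k under an assumed cosine cyclical schedule.

    Each cycle starts at alpha0 (large steps, to move between modes) and
    decays toward zero (small steps, to sample within a mode), then restarts.
    """
    cycle_len = math.ceil(total_iters / num_cycles)
    # Fractional position inside the current cycle, in [0, 1).
    pos = ((k - 1) % cycle_len) / cycle_len
    return 0.5 * alpha0 * (math.cos(math.pi * pos) + 1.0)


# Hypothetical settings: 100 iterations split into 4 cycles of 25 steps each.
schedule = [cyclical_stepsize(k, 100, 4, 0.1) for k in range(1, 101)]
```

In this sketch the stepsize returns to `alpha0` at the first iteration of every cycle (iterations 1, 26, 51, 76 for the settings above) and shrinks monotonically in between. A sampler would typically keep samples only from the small-stepsize tail of each cycle, though where to place that cutoff is a tuning choice not specified in the text above.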

Code Implementations: 3 repos
