LG · ML · Jul 19, 2021

Structured Stochastic Gradient MCMC

arXiv:2107.09028v4 · 14 citations
Originality: Incremental advance
AI Analysis

This work targets practitioners in Bayesian deep learning who need efficient, accurate inference without restrictive assumptions on the posterior. The advance is incremental: it builds on existing SGMCMC and variational methods.

The paper tackles the speed-accuracy tradeoff in Bayesian inference for large-scale models. It proposes a non-parametric variational approximation that avoids strong assumptions on the posterior's functional form. On CIFAR-10, SVHN, and FMNIST, the authors report improved convergence speed and/or final accuracy compared to SGMCMC and VI.

Stochastic gradient Markov Chain Monte Carlo (SGMCMC) is considered the gold standard for Bayesian inference in large-scale models, such as Bayesian neural networks. Since practitioners face speed-versus-accuracy tradeoffs in these models, variational inference (VI) is often the preferable option. Unfortunately, VI makes strong assumptions on both the factorization and functional form of the posterior. In this work, we propose a new non-parametric variational approximation that makes no assumptions about the approximate posterior's functional form and allows practitioners to specify the exact dependencies the algorithm should respect or break. The approach relies on a new Langevin-type algorithm that operates on a modified energy function, where parts of the latent variables are averaged over samples from earlier iterations of the Markov chain. This way, statistical dependencies can be broken in a controlled way, allowing the chain to mix faster. This scheme can be further modified in a "dropout" manner, leading to even more scalability. We test our scheme for ResNet-20 on CIFAR-10, SVHN, and FMNIST. In all cases, we find improvements in convergence speed and/or final accuracy compared to SGMCMC and VI.
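The idea of a Langevin update on a modified energy, where the other variables are averaged over earlier samples, can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a two-dimensional Gaussian target with energy U(theta) = 0.5 * theta^T P theta, uses exact rather than stochastic gradients, and treats each coordinate as its own group. The function name `langevin_samples` and the parameters `buffer_size` and `thin` are hypothetical choices made for this example only.

```python
import numpy as np


def langevin_samples(precision, break_dependencies, n_steps=20000,
                     step_size=0.01, buffer_size=40, thin=50, seed=0):
    """Unadjusted Langevin sampling for U(theta) = 0.5 * theta^T P theta.

    With break_dependencies=True, the gradient for coordinate i is averaged
    over stored earlier samples of the *other* coordinates (toy assumption:
    every coordinate forms its own group).
    """
    rng = np.random.default_rng(seed)
    dim = precision.shape[0]
    theta = np.zeros(dim)
    buffer = np.zeros((buffer_size, dim))  # thinned samples from earlier iterations
    samples = np.empty((n_steps, dim))
    for t in range(n_steps):
        if break_dependencies:
            grad = np.empty(dim)
            for i in range(dim):
                # Keep the current theta_i, replace all other coordinates
                # by their values in the stored earlier samples.
                mixed = buffer.copy()
                mixed[:, i] = theta[i]
                # dU/dtheta_i = sum_j P_ij * theta_j, averaged over the buffer.
                grad[i] = np.mean(mixed @ precision[i])
        else:
            grad = precision @ theta  # plain Langevin gradient
        noise = rng.standard_normal(dim)
        theta = theta - step_size * grad + np.sqrt(2.0 * step_size) * noise
        if t % thin == 0:
            buffer[(t // thin) % buffer_size] = theta  # overwrite oldest slot
        samples[t] = theta
    return samples


def correlation(samples, burn_in=5000):
    """Empirical correlation between the two coordinates after burn-in."""
    return np.corrcoef(samples[burn_in:].T)[0, 1]


if __name__ == "__main__":
    # Target with strongly correlated coordinates (covariance 0.9).
    precision = np.linalg.inv(np.array([[1.0, 0.9], [0.9, 1.0]]))
    plain = langevin_samples(precision, break_dependencies=False)
    structured = langevin_samples(precision, break_dependencies=True)
    print("plain correlation:", correlation(plain))
    print("structured correlation:", correlation(structured))
```

In this toy setting the plain chain should reproduce the strong correlation of the target, while the averaged variant should yield nearly uncorrelated coordinates, which is the controlled breaking of dependencies the abstract describes. How closely this mirrors the paper's actual algorithm, including its stochastic gradients and "dropout" variant, is not something the abstract alone settles.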

Code Implementations: 1 repo
