LG CO Mar 9, 2023

Scalable Stochastic Gradient Riemannian Langevin Dynamics in Non-Diagonal Metrics

arXiv:2303.05101v4
9 citations
h-index: 25
Originality: Incremental advance
AI Analysis

This work addresses computational bottlenecks in Bayesian inference for neural networks, offering incremental improvements over existing methods.

The authors address inefficient posterior exploration in stochastic-gradient sampling for Bayesian neural networks. They propose two non-diagonal metrics that improve convergence and exploration at only a minor computational overhead over diagonal metrics, and report improvements for fully connected neural networks with sparsity-inducing priors and for convolutional neural networks with correlated priors.

Stochastic-gradient sampling methods are often used to perform Bayesian inference on neural networks. Methods that incorporate notions of differential geometry have been observed to perform better, with the Riemannian metric improving posterior exploration by accounting for the local curvature. However, existing methods often resort to simple diagonal metrics to remain computationally efficient, which loses some of the gains. We propose two non-diagonal metrics that can be used in stochastic-gradient samplers to improve convergence and exploration while incurring only a minor computational overhead over diagonal metrics. We show that these metrics can provide improvements for fully connected neural networks (NNs) with sparsity-inducing priors and for convolutional NNs with correlated priors. For some other choices, the posterior is sufficiently easy that the simpler metrics also suffice.
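To make the baseline concrete, the following is a minimal sketch of a stochastic-gradient Langevin sampler with a simple diagonal metric, the kind of setup the abstract contrasts against. It is not the paper's method and does not implement either proposed non-diagonal metric. Everything here is an assumption for illustration: the toy two-dimensional Gaussian target, the RMSprop-style running second moment used as the diagonal metric, the function names `grad_log_post` and `diagonal_sgrld`, and all hyperparameter values. The curvature correction term that a position-dependent metric formally requires is omitted, so the sampler is only approximate.

```python
import numpy as np


def grad_log_post(theta, precision, rng, noise_scale=0.1):
    """Noisy gradient of a toy zero-mean Gaussian log-posterior.

    The added Gaussian noise is a stand-in for minibatch noise (an assumption).
    """
    exact = -precision * theta  # gradient of -0.5 * sum(precision * theta**2)
    return exact + noise_scale * rng.standard_normal(theta.shape)


def diagonal_sgrld(n_steps=20000, step_size=0.01, decay=0.99, damping=0.1, seed=0):
    """Langevin updates preconditioned by a diagonal, RMSprop-style inverse metric."""
    rng = np.random.default_rng(seed)
    precision = np.array([1.0, 100.0])  # ill-conditioned toy target
    theta = np.ones(2)
    # Initialise the running second moment from the first noisy gradient.
    second_moment = grad_log_post(theta, precision, rng) ** 2
    samples = np.empty((n_steps, 2))
    for t in range(n_steps):
        grad = grad_log_post(theta, precision, rng)
        second_moment = decay * second_moment + (1.0 - decay) * grad**2
        # Diagonal inverse metric: one scale per coordinate, no cross terms.
        inv_metric = 1.0 / (damping + np.sqrt(second_moment))
        drift = 0.5 * step_size * inv_metric * grad
        noise = np.sqrt(step_size * inv_metric) * rng.standard_normal(2)
        theta = theta + drift + noise  # curvature correction term omitted
        samples[t] = theta
    return samples


samples = diagonal_sgrld()
# Per-coordinate spread after discarding the first 2000 steps as burn-in.
print(samples[2000:].std(axis=0))
```

Because the inverse metric is a vector with one entry per coordinate, it can rescale each parameter independently but cannot represent correlations between parameters. A non-diagonal metric would replace `inv_metric` with a full or structured matrix, which is where the computational cost that the abstract refers to arises.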

Code Implementations: 1 repo
