ML | LG | ME | Jun 12, 2023

Riemannian Laplace approximations for Bayesian neural networks

arXiv:2306.07158v1 | 17 citations | h-index: 30
Originality: Highly original
AI Analysis

The paper addresses a practical bottleneck in Bayesian deep learning by offering researchers and practitioners a more robust posterior approximation method.

Weight-posteriors in Bayesian neural networks are often non-Gaussian, so Gaussian approximations fit them poorly. The paper proposes a Riemannian Laplace approximation that adapts to the shape of the posterior; it consistently improves over the conventional Laplace approximation and is less sensitive to the choice of prior.

Bayesian neural networks often approximate the weight-posterior with a Gaussian distribution. However, practical posteriors are often, even locally, highly non-Gaussian, and empirical performance deteriorates. We propose a simple parametric approximate posterior that adapts to the shape of the true posterior through a Riemannian metric that is determined by the log-posterior gradient. We develop a Riemannian Laplace approximation where samples naturally fall into weight-regions with low negative log-posterior. We show that these samples can be drawn by solving a system of ordinary differential equations, which can be done efficiently by leveraging the structure of the Riemannian metric and automatic differentiation. Empirically, we demonstrate that our approach consistently improves over the conventional Laplace approximation across tasks. We further show that, unlike the conventional Laplace approximation, our method is not overly sensitive to the choice of prior, which alleviates a practical pitfall of current approaches.
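
The sampling idea in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation; it rests on several assumptions that the abstract does not spell out. It assumes the metric has the form G(θ) = I + ∇L(θ)∇L(θ)ᵀ, where L is the negative log-posterior, which is one metric "determined by the log-posterior gradient". Under that assumption the geodesic equation reduces to c'' = -∇L(c) (c'ᵀ H(c) c') / (1 + ‖∇L(c)‖²), with H the Hessian of L. The two-dimensional banana-shaped `neg_log_post`, the hand-written `grad` and `hess` (standing in for automatic differentiation), and the fixed-step RK4 integrator are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical toy negative log-posterior with a curved (banana-shaped) valley.
# Its minimum (the MAP estimate) is at the origin.
def neg_log_post(theta):
    x, y = theta
    return 0.5 * x**2 + 0.5 * (y - x**2) ** 2

# Hand-written gradient; a real model would use automatic differentiation.
def grad(theta):
    x, y = theta
    r = y - x**2
    return np.array([x - 2.0 * x * r, r])

# Hand-written Hessian of the toy loss.
def hess(theta):
    x, y = theta
    return np.array([[1.0 - 2.0 * y + 6.0 * x**2, -2.0 * x],
                     [-2.0 * x, 1.0]])

# Geodesic ODE as a first-order system in (position c, velocity v),
# assuming the metric G = I + grad grad^T.
def geodesic_rhs(state):
    c, v = state[:2], state[2:]
    g = grad(c)
    accel = -g * (v @ hess(c) @ v) / (1.0 + g @ g)
    return np.concatenate([v, accel])

# Exponential map: integrate the geodesic from t = 0 to t = 1
# with classical fixed-step RK4.
def exp_map(theta0, v0, n_steps=200):
    state = np.concatenate([theta0, v0]).astype(float)
    h = 1.0 / n_steps
    for _ in range(n_steps):
        k1 = geodesic_rhs(state)
        k2 = geodesic_rhs(state + 0.5 * h * k1)
        k3 = geodesic_rhs(state + 0.5 * h * k2)
        k4 = geodesic_rhs(state + h * k3)
        state = state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return state[:2]

# Draw tangent vectors from the standard Laplace Gaussian at the MAP,
# then map each one onto the loss surface along a geodesic.
def sample_riemannian_laplace(theta_map, n_samples, seed=0):
    rng = np.random.default_rng(seed)
    cov = np.linalg.inv(hess(theta_map))  # Laplace covariance = inverse Hessian
    vs = rng.multivariate_normal(np.zeros(2), cov, size=n_samples)
    linear = theta_map + vs               # conventional Laplace samples
    riemannian = np.array([exp_map(theta_map, v) for v in vs])
    return vs, linear, riemannian

if __name__ == "__main__":
    theta_map = np.zeros(2)
    vs, linear, riemannian = sample_riemannian_laplace(theta_map, n_samples=50)
    print("mean loss, conventional Laplace:",
          np.mean([neg_log_post(t) for t in linear]))
    print("mean loss, Riemannian sketch:  ",
          np.mean([neg_log_post(t) for t in riemannian]))
```

Under the assumed metric, a geodesic that starts at the MAP with velocity v has arc length ‖v‖ on the graph of the loss, so the loss at its endpoint can exceed the MAP loss by at most ‖v‖. This is one way to see why such samples tend to stay in regions of low negative log-posterior, whereas the straight-line samples of the conventional Laplace approximation carry no such bound.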
