ML · LG · Jul 26, 2022

One Simple Trick to Fix Your Bayesian Neural Network

arXiv:2207.13167v1 · 1 citation · h-index: 34
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck for practitioners of Bayesian deep learning, but the advance is incremental: it concerns the choice of activation function rather than a fundamental breakthrough.

The paper tackles the problem of mean-field variational inference (MFVI) struggling with ReLU activations in Bayesian neural networks, finding that using Leaky ReLU activations leads to more Gaussian-like posteriors and reduces expected calibration error (ECE).
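For readers unfamiliar with the two activations being compared, the sketch below shows their standard definitions. The function names and the negative slope of 0.01 are illustrative assumptions, not values taken from the paper, which may use a different slope.

```python
def relu(x: float) -> float:
    """Standard ReLU: passes positive inputs, maps everything else to zero."""
    return x if x > 0.0 else 0.0


def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """Leaky ReLU: like ReLU, but negative inputs are scaled instead of zeroed.

    negative_slope=0.01 is an assumed, commonly used default,
    not a setting reported in the paper.
    """
    return x if x > 0.0 else negative_slope * x


if __name__ == "__main__":
    for value in (-2.0, 0.0, 3.0):
        print(value, relu(value), leaky_relu(value))
```

The only difference is on the negative side: ReLU is flat there, while Leaky ReLU keeps a small nonzero slope.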

One of the most popular estimation methods in Bayesian neural networks (BNNs) is mean-field variational inference (MFVI). In this work, we show that neural networks with the ReLU activation function induce posteriors that are hard to fit with MFVI. We provide a theoretical justification for this phenomenon, study it empirically, and report the results of a series of experiments investigating the effect of the activation function on the calibration of BNNs. We find that using Leaky ReLU activations leads to more Gaussian-like weight posteriors and achieves a lower expected calibration error (ECE) than the ReLU-based counterpart.
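The abstract reports ECE without defining it. As a rough illustration, a commonly used binned estimator groups predictions by confidence and averages the gap between accuracy and mean confidence per bin, weighted by bin size. The sketch below assumes equal-width bins and a hypothetical function name; the paper's exact binning scheme and number of bins are not stated here and may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin share) * |accuracy - mean confidence|.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct: matching truthy/falsy flags, True where the prediction was right.
    n_bins: number of equal-width bins (10 is an assumed default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Per-bin running totals: sample count, summed confidence, number correct.
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0] * n_bins
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        b = min(int(conf * n_bins), n_bins - 1)
        counts[b] += 1
        conf_sums[b] += conf
        hit_sums[b] += int(bool(hit))

    ece = 0.0
    for count, conf_sum, hits in zip(counts, conf_sums, hit_sums):
        if count == 0:
            continue  # empty bins contribute nothing
        ece += (count / n) * abs(hits / count - conf_sum / count)
    return ece
```

For example, four predictions made with confidence 0.9, of which three are correct, fall into a single bin with accuracy 0.75, giving an ECE of about 0.15. A lower value means the stated confidences track the observed accuracy more closely.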

