Scalable Bayesian neural networks by layer-wise input augmentation
This paper addresses the challenge of scalable uncertainty representation in deep learning, aimed at practitioners working with large models.
The paper tackles the impracticality of standard Bayesian neural networks by proposing implicit Bayesian neural networks, which augment each layer's inputs with latent variables. It reports state-of-the-art performance in calibration, robustness, and uncertainty characterization on large-scale image classification tasks using networks with millions of parameters.
We introduce implicit Bayesian neural networks, a simple and scalable approach for uncertainty representation in deep learning. The standard Bayesian approach to deep learning requires inferring the posterior distribution over millions of parameters, which is impractical. Instead, we propose to induce a distribution that captures the uncertainty over neural networks by augmenting each layer's inputs with latent variables. We present appropriate input distributions and demonstrate state-of-the-art performance in terms of calibration, robustness and uncertainty characterisation on large-scale image classification tasks with multi-million-parameter networks.
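To make the idea of layer-wise input augmentation concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify how the latent variables enter each layer or which input distributions are used, so the sketch assumes multiplicative Gaussian latents with mean 1 and a fixed standard deviation `sigma`. The function names (`augmented_forward`, `predictive`), the toy weights, and the Monte Carlo averaging of softmax outputs are all illustrative assumptions.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: one output per (row, bias) pair."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def augmented_forward(layers, x, sigma, rng):
    """One stochastic forward pass with latent variables on each layer's input.

    ASSUMPTION: the latent z is multiplicative, z ~ N(1, sigma^2), drawn
    independently per input unit. The weights themselves stay deterministic.
    """
    h = x
    for i, (weights, bias) in enumerate(layers):
        z = [rng.gauss(1.0, sigma) for _ in h]
        h = linear(weights, bias, [zi * hi for zi, hi in zip(z, h)])
        if i < len(layers) - 1:
            h = [max(0.0, v) for v in h]  # ReLU on hidden layers only
    return softmax(h)


def predictive(layers, x, sigma, n_samples, seed=0):
    """Monte Carlo estimate of the predictive distribution over classes."""
    rng = random.Random(seed)
    samples = [augmented_forward(layers, x, sigma, rng)
               for _ in range(n_samples)]
    n_classes = len(samples[0])
    return [sum(p[c] for p in samples) / n_samples for c in range(n_classes)]


# Hypothetical toy network: 2 inputs -> 3 hidden units -> 2 classes.
layers = [
    ([[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5], [-0.5, 0.7, 0.2]], [0.0, 0.0]),
]
x = [1.0, 2.0]
mean_probs = predictive(layers, x, sigma=0.3, n_samples=200)
print(mean_probs)
```

Under these assumptions, setting `sigma=0` recovers an ordinary deterministic network, and the only stochastic quantities are the per-layer latents rather than the millions of weights, which is the scalability argument made in the abstract.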