LG PL MLDec 10, 2018

Bayesian Layers: A Module for Neural Network Uncertainty

Dustin Tran, Michael W. Dusenberry, Mark van der Wilk, Danijar Hafner

arXiv:1812.03973v324.1144 citations

Originality Incremental advance

AI Analysis

This work addresses the problem of integrating uncertainty quantification into neural networks for researchers and practitioners, though it is incremental as it builds on existing methods like Bayesian neural nets and dropout.

The authors tackled the challenge of enabling fast experimentation with neural network uncertainty by introducing Bayesian Layers, a module that provides drop-in replacements for common layers in neural network libraries, and demonstrated its application by fitting a 5-billion parameter Bayesian Transformer on 512 TPUv2 cores for machine translation uncertainty.

We describe Bayesian Layers, a module designed for fast experimentation with neural network uncertainty. It extends neural network libraries with drop-in replacements for common layers. This enables composition via a unified abstraction over deterministic and stochastic functions and allows for scalability via the underlying system. These layers capture uncertainty over weights (Bayesian neural nets), pre-activation units (dropout), activations ("stochastic output layers"), or the function itself (Gaussian processes). They can also be reversible to propagate uncertainty from input to output. We include code examples for common architectures such as Bayesian LSTMs, deep GPs, and flow-based models. As demonstration, we fit a 5-billion parameter "Bayesian Transformer" on 512 TPUv2 cores for uncertainty in machine translation and a Bayesian dynamics model for model-based planning. Finally, we show how Bayesian Layers can be used within the Edward2 probabilistic programming language for probabilistic programs with stochastic processes.

View on arXiv PDF

Similar