ML LG CO

Sep 13, 2019

Adversarial $α$-divergence Minimization for Bayesian Approximate Inference

Simón Rodríguez Santana, Daniel Hernández-Lobato

arXiv:1909.06945v3

7.78 citations

Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the need for uncertainty estimation in neural networks, which is critical for applications such as risk-sensitive decision-making, but the method appears incremental because it builds on existing divergence minimization techniques.

The paper tackles uncertainty estimation in neural network predictions by proposing a method for approximate Bayesian inference based on minimizing α-divergences. In regression problems the method often improves test log-likelihood and sometimes squared error; in classification it yields competitive results.

Neural networks are popular state-of-the-art models for many different tasks. They are often trained via back-propagation to find a value of the weights that correctly predicts the observed data. Although back-propagation has shown good performance in many applications, it cannot easily output an estimate of the uncertainty in the predictions made. Estimating the uncertainty in the predictions is a critical aspect with important applications, and one method to obtain this information is following a Bayesian approach to estimate a posterior distribution on the model parameters. This posterior distribution summarizes which parameter values are compatible with the data, but is usually intractable and has to be approximated. Several mechanisms have been considered for solving this problem. We propose here a general method for approximate Bayesian inference that is based on minimizing α-divergences and that allows for flexible approximate distributions. The method is evaluated in the context of Bayesian neural networks on extensive experiments. The results show that, in regression problems, it often gives better performance in terms of the test log-likelihood and sometimes in terms of the squared error. In classification problems, however, it gives competitive results.
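The abstract does not state which form of the α-divergence is minimized. As a rough illustration only, the sketch below assumes one common definition (Amari's), $D_α[p \| q] = \frac{1}{α(1-α)}\left(1 - \int p(θ)^{α} q(θ)^{1-α}\, dθ\right)$, and evaluates it numerically for two one-dimensional Gaussians. The function names, the grid, and the example densities are hypothetical choices made for this sketch and are not taken from the paper or its code.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian evaluated at the points x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def alpha_divergence(p, q, dx, alpha):
    """Amari alpha-divergence D_alpha[p || q] (assumed form, see lead-in).

    p and q are density values on a uniform grid with spacing dx.
    alpha must differ from 0 and 1, where the formula is only defined as a limit.
    """
    if alpha in (0.0, 1.0):
        raise ValueError("alpha must not be exactly 0 or 1; use a nearby value")
    # Riemann-sum approximation of the integral of p^alpha * q^(1 - alpha).
    overlap = np.sum(p ** alpha * q ** (1.0 - alpha)) * dx
    return (1.0 - overlap) / (alpha * (1.0 - alpha))


def kl_divergence(p, q, dx):
    """KL(p || q) on the same grid, used here only as a reference value."""
    return np.sum(p * np.log(p / q)) * dx


# Illustrative densities: p plays the role of a target, q of an approximation.
x = np.linspace(-12.0, 12.0, 20001)
dx = x[1] - x[0]
p = gaussian_pdf(x, mean=0.0, std=1.0)
q = gaussian_pdf(x, mean=1.0, std=1.5)

for alpha in (0.001, 0.5, 0.999):
    print(f"alpha={alpha}: D_alpha = {alpha_divergence(p, q, dx, alpha):.4f}")
print(f"KL(p || q) = {kl_divergence(p, q, dx):.4f}")
print(f"KL(q || p) = {kl_divergence(q, p, dx):.4f}")
```

Under this definition the divergence approaches KL(p || q) as α approaches 1 and KL(q || p) as α approaches 0, so α interpolates between the two directions of the KL divergence. The grid-based integration is only practical in one dimension and is not how a method for neural network posteriors would be implemented.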
