Noise Contrastive Priors for Functional Uncertainty
This paper addresses the problem of unreliable uncertainty estimates in Bayesian neural networks. For machine learning practitioners, it offers a scalable solution, although the contribution is incremental over prior methods.
The paper tackles the challenge of obtaining reliable uncertainty estimates in neural networks by proposing noise contrastive priors (NCPs), which train models to output high uncertainty for out-of-distribution data. The authors report uncertainty estimates that are useful for active learning and demonstrate scalability on the flight delays data set.
Obtaining reliable uncertainty estimates of neural network predictions is a long-standing challenge. Bayesian neural networks have been proposed as a solution, but it remains open how to specify their prior. In particular, the common practice of an independent normal prior in weight space imposes relatively weak constraints on the function posterior, allowing it to generalize in unforeseen ways on inputs outside of the training distribution. We propose noise contrastive priors (NCPs) to obtain reliable uncertainty estimates. The key idea is to train the model to output high uncertainty for data points outside of the training distribution. NCPs do so using an input prior, which adds noise to the inputs of the current mini-batch, and an output prior, which is a wide distribution given these inputs. NCPs are compatible with any model that can output uncertainty estimates, are easy to scale, and yield reliable uncertainty estimates throughout training. Empirically, we show that NCPs prevent overfitting outside of the training distribution and result in uncertainty estimates that are useful for active learning. We demonstrate the scalability of our method on the flight delays data set, where we significantly improve upon previously published results.
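As a rough illustration of the input prior and output prior described above, the following Python sketch perturbs a mini-batch with Gaussian noise and penalizes a model whose predictive distribution at those noisy inputs differs from a wide Gaussian. The abstract does not state the training objective, so the particular loss used here (a Gaussian negative log-likelihood on the clean inputs plus a KL term at the noisy inputs), the direction of the KL term, the function names, and all hyperparameter values are assumptions made for illustration and should not be read as the authors' implementation.

```python
import math
import random


def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of y under a scalar Gaussian N(mu, sigma^2)."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2.0 * sigma ** 2)


def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """KL(N(mu_p, sigma_p^2) || N(mu_q, sigma_q^2)) for scalar Gaussians."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
            - 0.5)


def perturb_inputs(xs, input_noise_std, rng):
    """Input prior (assumed form): add Gaussian noise to each mini-batch input."""
    return [x + rng.gauss(0.0, input_noise_std) for x in xs]


def ncp_style_loss(model, xs, ys, rng, input_noise_std=0.5,
                   prior_mean=0.0, prior_std=3.0, ood_weight=1.0):
    """Data-fit term on clean inputs plus a penalty at noisy inputs.

    `model(x)` is a hypothetical callable returning (mean, std) of a
    Gaussian prediction. The output prior is assumed to be the wide
    Gaussian N(prior_mean, prior_std^2).
    """
    nll = sum(gaussian_nll(y, *model(x)) for x, y in zip(xs, ys)) / len(xs)
    noisy_xs = perturb_inputs(xs, input_noise_std, rng)
    kl = sum(gaussian_kl(prior_mean, prior_std, *model(x)) for x in noisy_xs) / len(noisy_xs)
    return nll + ood_weight * kl, nll, kl


def overconfident_model(x):
    """Toy model: fits y = x with the same small uncertainty everywhere."""
    return x, 0.1


def wide_model(x):
    """Toy model: ignores the input and always predicts the wide output prior."""
    return 0.0, 3.0


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [0.0, 0.1, -0.1, 0.05]
    ys = list(xs)
    for name, model in [("overconfident", overconfident_model), ("wide", wide_model)]:
        total, nll, kl = ncp_style_loss(model, xs, ys, rng)
        print(f"{name}: total={total:.3f} nll={nll:.3f} kl={kl:.3f}")
```

In this toy setting the overconfident model fits the clean data better but pays a large penalty at the noisy inputs, while the model that always predicts the wide prior pays no penalty and fits the data poorly. A trained model would have to trade the two terms off, and the weight `ood_weight` is a hypothetical knob for that trade-off.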