ML LG Jul 31, 2020

Cold Posteriors and Aleatoric Uncertainty

Ben Adlam, Jasper Snoek, Samuel L. Smith

arXiv:2008.00029v1 15.527 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses how to interpret and justify temperature tuning in Bayesian inference, a question of interest to machine learning researchers. The contribution is incremental: it builds on prior observations without introducing a new method.

The paper tackles the 'cold posterior' effect in Bayesian neural networks, in which tuning the posterior temperature outperforms exact inference. It argues that standard priors overestimate the aleatoric uncertainty in datasets with high-quality labels, such as MNIST or CIFAR, and that reducing the temperature can yield models that better reflect prior beliefs about label reliability.

Recent work has observed that one can outperform exact inference in Bayesian neural networks by tuning the "temperature" of the posterior on a validation set (the "cold posterior" effect). To help interpret this phenomenon, we argue that commonly used priors in Bayesian neural networks can significantly overestimate the aleatoric uncertainty in the labels on many classification datasets. This problem is particularly pronounced in academic benchmarks like MNIST or CIFAR, for which the quality of the labels is high. For the special case of Gaussian process regression, any positive temperature corresponds to a valid posterior under a modified prior, and tuning this temperature is directly analogous to empirical Bayes. On classification tasks, there is no direct equivalence between modifying the prior and tuning the temperature; however, reducing the temperature can lead to models which better reflect our belief that one gains little information by relabeling existing examples in the training set. Therefore, although cold posteriors do not always correspond to an exact inference procedure, we believe they may often better reflect our true prior beliefs.
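
The Gaussian process claim in the abstract can be checked numerically. The sketch below is an illustration, not the paper's own construction: it assumes the temperature T is applied to the full posterior (likelihood times prior, raised to the power 1/T), and it assumes an RBF kernel, synthetic data, and the helper names `rbf_kernel` and `gp_posterior`, none of which come from the source. Under those assumptions, raising a Gaussian posterior to the power 1/T keeps its mean and multiplies its covariance by T, which matches exact inference in a model whose kernel and noise variance are both multiplied by T.

```python
import numpy as np

def rbf_kernel(xa, xb, lengthscale=1.0):
    # Squared-exponential kernel on 1-D inputs (an assumed choice of prior).
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, kernel_scale, noise_var):
    # Exact GP regression posterior over latent function values at x_test.
    K = kernel_scale * rbf_kernel(x_train, x_train)
    K_s = kernel_scale * rbf_kernel(x_test, x_train)
    K_ss = kernel_scale * rbf_kernel(x_test, x_test)
    A = K + noise_var * np.eye(len(x_train))
    mean = K_s @ np.linalg.solve(A, y_train)
    cov = K_ss - K_s @ np.linalg.solve(A, K_s.T)
    return mean, cov

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
x_train = np.linspace(-3.0, 3.0, 20)
y_train = np.sin(x_train) + 0.3 * rng.standard_normal(20)
x_test = np.linspace(-3.0, 3.0, 5)

T = 0.25          # a "cold" temperature (T < 1)
noise_var = 0.09  # assumed observation noise variance

# Exact posterior at T = 1.
mean_exact, cov_exact = gp_posterior(x_train, y_train, x_test, 1.0, noise_var)

# Tempered posterior: a Gaussian density raised to the power 1/T has the
# same mean and a covariance multiplied by T.
mean_tempered, cov_tempered = mean_exact, T * cov_exact

# Exact inference in the modified model: kernel and noise both scaled by T.
mean_mod, cov_mod = gp_posterior(x_train, y_train, x_test, T, T * noise_var)

print("means match:", np.allclose(mean_tempered, mean_mod))
print("covariances match:", np.allclose(cov_tempered, cov_mod))
```

In this sketch the temperature only rescales the posterior uncertainty and leaves the predictive mean alone, which is one way to see why choosing T on held-out data resembles choosing a prior scale by empirical Bayes. No such rescaling argument is available for the classification case, consistent with the abstract.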
