LG | ML | Sep 11, 2023

The fine print on tempered posteriors

arXiv:2309.05292v1 | 4 citations | h-index: 18
Originality: Incremental advance
AI Analysis

This work addresses the practical implications of Bayesian model tuning for machine learning practitioners, revealing counterintuitive results about temperature parameters.

The paper investigates tempered posteriors and finds that stochasticity often does not improve test accuracy, with the coldest temperature being optimal, and that calibration gains come at the cost of accuracy degradation.

We conduct a detailed investigation of tempered posteriors and uncover a number of crucial and previously undiscussed points. Contrary to previous results, we first show that for realistic models and datasets and the tightly controlled case of the Laplace approximation to the posterior, stochasticity does not in general improve test accuracy. The coldest temperature is often optimal. One might think that Bayesian models with some stochasticity can at least obtain improvements in terms of calibration. However, we show empirically that when gains are obtained this comes at the cost of degradation in test accuracy. We then discuss how targeting Frequentist metrics using Bayesian models provides a simple explanation of the need for a temperature parameter $λ$ in the optimization objective. Contrary to prior works, we finally show through a PAC-Bayesian analysis that the temperature $λ$ cannot be seen as simply fixing a misspecified prior or likelihood.
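To make the role of the temperature concrete, here is a minimal sketch of a tempered posterior in a toy conjugate model. It assumes one common convention, in which the likelihood is raised to the power $λ$, so that the tempered posterior is proportional to p(D | w)^λ p(w) and larger $λ$ means a colder, more concentrated posterior; the paper's own parameterization may differ. The model (a Gaussian mean with known noise variance and a zero-mean Gaussian prior), the function name, and the data are illustrative assumptions and are not taken from the paper, which works with the Laplace approximation on realistic models.

```python
def tempered_gaussian_posterior(x, lam, noise_var=1.0, prior_var=1.0):
    """Tempered posterior over the mean mu of a Gaussian with known variance.

    Assumed convention (hypothetical, for illustration only):
        p_lam(mu | x) is proportional to p(x | mu)**lam * p(mu),
    with prior mu ~ N(0, prior_var) and likelihood x_i ~ N(mu, noise_var).
    Returns the posterior mean and variance, both available in closed form.
    """
    n = len(x)
    # Raising the likelihood to the power lam scales its precision by lam.
    precision = 1.0 / prior_var + lam * n / noise_var
    mean = (lam * sum(x) / noise_var) / precision
    return mean, 1.0 / precision


# Made-up observations, used only to show the trend in lam.
data = [0.8, 1.1, 1.3, 0.9]
for lam in (0.5, 1.0, 10.0, 1000.0):
    mean, var = tempered_gaussian_posterior(data, lam)
    print(f"lambda={lam:g}: mean={mean:.4f}, var={var:.6f}")
```

Under this convention, increasing $λ$ shrinks the posterior variance and pulls the posterior mean toward the maximum-likelihood estimate, so the coldest setting behaves almost like a deterministic point estimate. That is the limit the abstract refers to when it says stochasticity does not in general improve test accuracy.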
