ML · LG · Oct 2, 2023

If there is no underfitting, there is no Cold Posterior Effect

arXiv:2310.01189v12 citationsh-index: 19
Originality: Synthesis-oriented
AI Analysis

This paper clarifies a key condition for the CPE in Bayesian models, addressing a theoretical gap for researchers in Bayesian deep learning. It is incremental, however, because it builds on prior work on model misspecification.

The paper investigates the cold posterior effect (CPE) in Bayesian deep learning. It shows that the CPE occurs only when the Bayesian posterior underfits, and proves theoretically that without underfitting there is no CPE.

The cold posterior effect (CPE) (Wenzel et al., 2020) in Bayesian deep learning shows that, for posteriors with a temperature $T<1$, the resulting posterior predictive can perform better than that of the Bayesian posterior ($T=1$). As the Bayesian posterior is known to be optimal under perfect model specification, many recent works have studied the presence of the CPE as a model misspecification problem, arising from the prior and/or from the likelihood function. In this work, we provide a more nuanced understanding of the CPE by showing that misspecification leads to the CPE only when the resulting Bayesian posterior underfits. In fact, we theoretically show that if there is no underfitting, there is no CPE.
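To make the role of the temperature concrete, here is a minimal sketch on a toy Beta-Bernoulli model, not the paper's own setup or code. It assumes the temperature acts on the likelihood only, through the exponent $1/T$, which is one common convention; the function names and the example numbers are hypothetical. With a strong prior centered at 0.5 and data that are 90% successes, the $T=1$ posterior stays close to the prior and fits the training data poorly, while a colder posterior moves toward the data.

```python
# Toy illustration (assumption: likelihood tempering with exponent 1/T).
# Prior: Beta(a, b). Data: Bernoulli trials with given successes/failures.
# Tempered posterior: Beta(a + successes / T, b + failures / T).

def tempered_beta_posterior(a, b, successes, failures, temperature):
    """Return the (alpha, beta) parameters of the tempered posterior."""
    lam = 1.0 / temperature  # T < 1 gives lam > 1, i.e. a "cold" posterior
    return a + lam * successes, b + lam * failures


def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


if __name__ == "__main__":
    # Hypothetical numbers: strong prior Beta(50, 50), 9 successes in 10 trials.
    for temperature in (1.0, 0.5, 0.1):
        alpha, beta = tempered_beta_posterior(50, 50, 9, 1, temperature)
        print(temperature, round(posterior_mean(alpha, beta), 3))
```

In this toy case the posterior mean moves from the prior mean toward the empirical frequency of 0.9 as $T$ decreases, which loosely mirrors the abstract's point: lowering the temperature can only help when the $T=1$ posterior underfits the training data.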
