ML LG · Oct 2, 2023

If there is no underfitting, there is no Cold Posterior Effect

Yijie Zhang, Yi-Shan Wu, Luis A. Ortega, Andrés R. Masegosa

arXiv:2310.01189v17.42 citations · h-index: 19

Originality: Synthesis-oriented

AI Analysis

This paper clarifies a key condition for the CPE in Bayesian models, addressing a theoretical gap for researchers in Bayesian deep learning. It is incremental, however, since it builds on prior work on model misspecification.

The paper investigates the cold posterior effect (CPE) in Bayesian deep learning, showing that it occurs only when the Bayesian posterior underfits, and theoretically proves that without underfitting, CPE does not exist.

The cold posterior effect (CPE) (Wenzel et al., 2020) in Bayesian deep learning is the observation that, for posteriors with a temperature $T<1$, the resulting posterior predictive can perform better than that of the Bayesian posterior ($T=1$). As the Bayesian posterior is known to be optimal under perfect model specification, many recent works have studied the presence of the CPE as a model misspecification problem, arising from the prior and/or from the likelihood function. In this work, we provide a more nuanced understanding of the CPE by showing that misspecification leads to the CPE only when the resulting Bayesian posterior underfits. In fact, we theoretically show that if there is no underfitting, there is no CPE.
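As a rough illustration of the claim, the toy sketch below tempers a conjugate Gaussian model and compares $T=1$ with a colder $T=0.1$. It is not taken from the paper. It assumes one common tempering convention, in which only the likelihood is raised to the power $1/T$; Wenzel et al. (2020) temper the full posterior instead, and the paper's exact definition may differ. The function names, the prior, the data, and the temperatures are all hypothetical choices. The prior is deliberately too narrow, so the $T=1$ posterior underfits the training data and the colder posterior fits it better.

```python
import numpy as np


def tempered_gaussian_posterior(x, sigma, prior_mean, prior_std, T):
    """Posterior over the mean of a Gaussian with known noise std `sigma`.

    Assumption: the likelihood is raised to the power 1/T, so T=1 recovers
    the ordinary Bayesian posterior. Returns (posterior mean, posterior variance).
    """
    lam = 1.0 / T
    n = len(x)
    precision = 1.0 / prior_std**2 + lam * n / sigma**2
    mean = (prior_mean / prior_std**2 + lam * np.sum(x) / sigma**2) / precision
    return mean, 1.0 / precision


def avg_train_log_lik(x, sigma, post_mean, post_var):
    """Average log density of the training data under the posterior predictive."""
    var = sigma**2 + post_var  # predictive variance = noise + posterior uncertainty
    return np.mean(-0.5 * np.log(2 * np.pi * var) - 0.5 * (x - post_mean) ** 2 / var)


# Hypothetical data: true mean 3, but the prior insists the mean is near 0.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=1.0, size=20)

results = {}
for T in (1.0, 0.1):
    m, v = tempered_gaussian_posterior(x, sigma=1.0, prior_mean=0.0, prior_std=0.1, T=T)
    results[T] = (m, avg_train_log_lik(x, 1.0, m, v))
    print(f"T={T}: posterior mean={m:.3f}, avg train log-lik={results[T][1]:.3f}")
```

In this setup the misspecified prior drags the $T=1$ posterior mean far from the sample mean, which is the underfitting regime the abstract describes. Widening the prior (for example `prior_std=10.0`) removes most of the underfitting, and with it most of the gap between the two temperatures.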
