LG · AI · NE · Dec 23, 2024

Pretraining with random noise for uncertainty calibration

arXiv:2412.17411v2 · 2 citations · h-index: 22
Originality: Incremental advance
AI Analysis

This paper addresses uncertainty calibration in machine learning models, which matters for applications such as out-of-distribution detection. The advance is incremental, however, because the method builds on standard initialization practices.

The paper tackles the problem of uncertainty calibration in machine learning by identifying random initialization as a cause of miscalibration, and shows that pretraining with random noise reduces overconfidence and improves calibration, aligning confidence with accuracy during training.
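To make "aligning confidence with accuracy" concrete, here is a minimal sketch of expected calibration error (ECE), a commonly used calibration metric. This summary does not say which metric the paper reports, so ECE is an assumption here, and the function name, the equal-width binning, and the default of 10 bins are illustrative choices.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean gap between accuracy and confidence over equal-width bins.

    confidences: predicted top-class probabilities, each in [0, 1]
    correct:     booleans, True where the top-class prediction was right
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")

    # Assign each prediction to a confidence bin; 1.0 falls in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, hit in members if hit) / len(members)
        # Each bin contributes its gap, weighted by its share of predictions.
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece
```

A model that reports 0.9 confidence but is right only half the time has a large gap between confidence and accuracy, while a model that reports 0.5 confidence at the same accuracy has a gap of zero.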

Uncertainty calibration is crucial for various machine learning applications, yet it remains challenging. Many models exhibit hallucinations - confident yet inaccurate responses - due to miscalibrated confidence. Here, we show that the common practice of random initialization in deep learning, often considered a standard technique, is an underlying cause of this miscalibration, leading to excessively high confidence in untrained networks. Our method, inspired by developmental neuroscience, addresses this issue by simply pretraining networks with random noise and labels, reducing overconfidence and bringing initial confidence levels closer to chance. This ensures optimal calibration, aligning confidence with accuracy during subsequent data training, without the need for additional pre- or post-processing. Pre-calibrated networks excel at identifying "unknown data," showing low confidence for out-of-distribution inputs, thereby resolving confidence miscalibration.
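The core idea in the abstract, pretraining on random noise inputs paired with random labels so that initial confidence moves toward chance, can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a linear softmax classifier instead of a deep network, Gaussian noise inputs, uniformly random labels, and plain SGD on cross-entropy. All names and hyperparameters (`pretrain_on_noise`, `demo`, `steps`, `lr`, the layer sizes) are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, biases, x):
    """Class probabilities of a linear softmax classifier (toy stand-in for a network)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


def mean_confidence(weights, biases, inputs):
    """Average top-class probability over a set of inputs."""
    return sum(max(predict(weights, biases, x)) for x in inputs) / len(inputs)


def pretrain_on_noise(weights, biases, rng, steps=5000, lr=0.02):
    """SGD on cross-entropy with Gaussian noise inputs and uniformly random labels."""
    n_classes, n_features = len(weights), len(weights[0])
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]  # noise input
        y = rng.randrange(n_classes)                          # random label
        probs = predict(weights, biases, x)
        for k in range(n_classes):
            # Cross-entropy gradient with respect to logit k.
            grad = probs[k] - (1.0 if k == y else 0.0)
            for j in range(n_features):
                weights[k][j] -= lr * grad * x[j]
            biases[k] -= lr * grad


def demo(seed=0, n_classes=5, n_features=20):
    """Return mean confidence on noise probes before and after noise pretraining."""
    rng = random.Random(seed)
    # Unit-variance random initialization: an assumed setup that yields
    # confident predictions from an untrained model.
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
               for _ in range(n_classes)]
    biases = [0.0] * n_classes
    probes = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
              for _ in range(300)]
    before = mean_confidence(weights, biases, probes)
    pretrain_on_noise(weights, biases, rng)
    after = mean_confidence(weights, biases, probes)
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print(f"mean confidence before: {before:.3f}, after: {after:.3f}")
```

Because the random labels carry no information about the noise inputs, the loss-minimizing prediction is uniform, so the mean confidence in this toy setup should fall toward chance (1 / `n_classes`) after pretraining. Whether this carries over to deep networks and real data is the paper's claim, which this sketch does not test.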

Code Implementations: 1 repo
