Too Sharp, Too Sure: When Calibration Follows Curvature
This work addresses unreliable confidence estimates in neural networks, a practical concern for machine learning practitioners, though the contribution is incremental: it builds on existing calibration and curvature concepts.
The paper tackles poor calibration in otherwise accurate neural networks by studying calibration as a training-time phenomenon and linking it to curvature and margins. It introduces a margin-aware training objective that improves out-of-sample calibration without sacrificing accuracy.
Modern neural networks can achieve high accuracy while remaining poorly calibrated, producing confidence estimates that do not match empirical correctness. Yet calibration is often treated as a post-hoc attribute. We take a different perspective: we study calibration as a training-time phenomenon on small vision tasks, and ask whether calibrated solutions can be obtained reliably by intervening on the training procedure. We identify a tight coupling between calibration, curvature, and margins during training of deep networks under multiple gradient-based methods. Empirically, Expected Calibration Error (ECE) closely tracks curvature-based sharpness throughout optimization. Mathematically, we show that both ECE and Gauss--Newton curvature are controlled, up to problem-specific constants, by the same margin-dependent exponential tail functional along the trajectory. Guided by this mechanism, we introduce a margin-aware training objective that explicitly targets robust-margin tails and local smoothness, yielding improved out-of-sample calibration across optimizers without sacrificing accuracy.
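For readers unfamiliar with the metric, the following is a minimal sketch of how Expected Calibration Error is commonly estimated from predicted class probabilities. The abstract does not specify a binning scheme, so the equal-width confidence bins, the default of 15 bins, and the function name `expected_calibration_error` are assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=15):
    """Estimate ECE with equal-width confidence bins (an assumed, common choice).

    probs:  array of shape (n_samples, n_classes) with predicted probabilities.
    labels: array of shape (n_samples,) with integer class labels.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)

    confidences = probs.max(axis=1)        # top-1 predicted probability
    predictions = probs.argmax(axis=1)     # top-1 predicted class
    correct = (predictions == labels).astype(float)

    # Interior bin edges; digitize with right=True assigns each confidence
    # to a bin of the form (lower, upper], giving indices 0 .. n_bins - 1.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Gap between empirical accuracy and mean confidence in this bin,
            # weighted by the fraction of samples that fall into the bin.
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece
```

A value near zero means confidence matches empirical correctness within each bin. Because the estimate depends on the number of bins, comparisons across training runs or optimizers are only meaningful when the same binning is used throughout.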