Revisiting Reweighted Risk for Calibration: AURC, Focal, and Inverse Focal Loss
This work addresses model calibration for machine learning practitioners, offering an incremental improvement through a flexible, efficient optimization method.
The paper tackles the unclear theoretical connections between reweighted risk functionals and calibration errors by establishing a principled link to selective classification. It shows that optimizing selective risk in low-confidence regions improves calibration, with competitive performance across datasets and architectures.
Several variants of reweighted risk functionals, such as focal loss, inverse focal loss, and the Area Under the Risk--Coverage Curve (AURC), have been proposed for improving model calibration, yet their theoretical connections to calibration errors remain unclear. In this paper, we revisit a broad class of weighted risk functions commonly used in deep learning and establish a principled connection between calibration error and selective classification. We show that minimizing calibration error is closely linked to the selective classification paradigm and demonstrate that optimizing selective risk in low-confidence regions naturally leads to improved calibration. The resulting loss shares a similar reweighting strategy with dual focal loss but offers greater flexibility through the choice of confidence score functions (CSFs). Our approach uses a bin-based cumulative distribution function (CDF) approximation, which enables efficient gradient-based optimization without expensive sorting and achieves $O(nK)$ complexity. Empirical evaluations demonstrate that our method achieves competitive calibration performance across a range of datasets and model architectures.
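To make the bin-based CDF approximation concrete, the following is a minimal sketch of one way such a scheme could look. It is not the authors' implementation. The function name `binned_cdf`, the sigmoid soft indicator, the temperature `tau`, the uniform bin edges on $[0, 1]$, and reading $n$ as the number of samples and $K$ as the number of bins are all assumptions made for illustration.

```python
import numpy as np

def binned_cdf(conf, num_bins=50, tau=0.01):
    """Approximate the empirical CDF of confidence scores without sorting.

    Illustrative sketch only: the soft indicator, `tau`, and the uniform
    bin edges are assumptions, not details taken from the paper.
    """
    conf = np.asarray(conf, dtype=float)
    # K + 1 uniformly spaced bin edges on [0, 1] (assumes scores lie in [0, 1]).
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    # Soft version of the indicator 1[conf_j <= edge_k]; this (n, K + 1)
    # array is where the O(nK) cost comes from.
    z = (edges[None, :] - conf[:, None]) / tau
    soft_leq = 0.5 * (1.0 + np.tanh(0.5 * z))  # numerically stable sigmoid
    # Approximate CDF value at every edge: the mean soft indicator over samples.
    cdf_at_edges = soft_leq.mean(axis=0)
    # Read off the CDF at each score by linear interpolation between edges.
    return np.interp(conf, edges, cdf_at_edges)

# Usage: approximate CDF values for a batch of random confidence scores.
rng = np.random.default_rng(0)
conf = rng.uniform(size=2000)
approx = binned_cdf(conf)
```

Every step here is a smooth elementwise operation, a mean, or a piecewise-linear lookup, so the same computation written in an automatic differentiation framework would allow gradients to pass through the approximate CDF values. Those values could then serve as confidence-dependent weights in a reweighted risk.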