Improving Multi-Class Calibration through Normalization-Aware Isotonic Techniques
This work addresses the need for reliable probability predictions in multi-class supervised learning, offering incremental improvements over existing methods for practitioners in fields like text and image classification.
The paper tackles suboptimal multi-class calibration with isotonic regression by proposing normalization-aware techniques that consistently improve negative log-likelihood and expected calibration error across text and image classification datasets.
Accurate and reliable probability predictions are essential for multi-class supervised learning tasks, where well-calibrated models enable rational decision-making. While isotonic regression has proven effective for binary calibration, its extension to multi-class problems via one-vs-rest calibration produces suboptimal results compared to parametric methods, limiting its practical adoption. In this work, we propose novel normalization-aware isotonic techniques for multi-class calibration, grounded in natural and intuitive assumptions expected by practitioners. Unlike prior approaches, our methods inherently account for probability normalization, either by incorporating normalization directly into the optimization process (NA-FIR) or by modeling the problem as a cumulative bivariate isotonic regression (SCIR). Empirical evaluation on a variety of text and image classification datasets across different model architectures shows that our approaches consistently improve negative log-likelihood (NLL) and expected calibration error (ECE).
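For orientation, the one-vs-rest isotonic baseline that the abstract contrasts against can be sketched as follows. This is not NA-FIR or SCIR, whose formulations are not given here; it is a minimal illustration, under stated assumptions, of fitting one isotonic map per class and normalizing only afterwards, together with the two reported metrics. All function names are hypothetical, the isotonic fit is a plain pool-adjacent-violators routine with linear interpolation between knots, and ECE is assumed to be the common top-label, equal-width-bin variant.

```python
import numpy as np

def fit_isotonic(scores, targets):
    """Pool-adjacent-violators fit of targets on scores; returns (knots, values)."""
    xs, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    sums = np.bincount(inverse.ravel(), weights=targets, minlength=len(xs))
    blocks = []  # each block: [sum of targets, sample count, number of knots]
    for s, c in zip(sums, counts):
        blocks.append([s, c, 1])
        # Merge while the previous block's mean exceeds the current block's mean.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s2, c2, n2 = blocks.pop()
            blocks[-1][0] += s2
            blocks[-1][1] += c2
            blocks[-1][2] += n2
    ys = np.concatenate([np.full(n, s / c) for s, c, n in blocks])
    return xs, ys

def fit_one_vs_rest(probs, labels):
    """Fit one isotonic map per class on (class score, is-this-class) pairs."""
    return [fit_isotonic(probs[:, k], (labels == k).astype(float))
            for k in range(probs.shape[1])]

def predict_one_vs_rest(models, probs, eps=1e-12):
    """Apply each per-class map, then renormalize rows post hoc (assumed baseline)."""
    cal = np.column_stack([np.interp(probs[:, k], xs, ys)
                           for k, (xs, ys) in enumerate(models)])
    cal = np.clip(cal, eps, None)  # avoid all-zero rows and log(0)
    return cal / cal.sum(axis=1, keepdims=True)

def negative_log_likelihood(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels])))

def expected_calibration_error(probs, labels, n_bins=15):
    """Top-label ECE with equal-width confidence bins (one common definition)."""
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    bins = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    total = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            total += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(total)
```

In this sketch the per-class maps are fitted independently and the rows are rescaled only at prediction time, so the fitting objective never sees the normalization step. That separation is the gap the normalization-aware methods are described as closing.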