Calibration of Neural Networks using Splines
This work addresses the need for reliable probability estimates in safety-critical decision-making, representing an incremental improvement over prior calibration techniques.
The authors calibrate neural network probabilities for safety-critical applications by introducing a binning-free calibration measure based on the Kolmogorov-Smirnov test and by approximating the empirical cumulative distribution with splines. The resulting recalibration function consistently outperformed existing methods on various image classification datasets.
Calibrating neural networks is of utmost importance when employing them in safety-critical applications where downstream decision-making depends on the predicted probabilities. Measuring calibration error amounts to comparing two empirical distributions. In this work, we introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test, in which the main idea is to compare the respective cumulative probability distributions. From this, by approximating the empirical cumulative distribution with a differentiable function via splines, we obtain a recalibration function, which maps the network outputs to actual (calibrated) class assignment probabilities. The spline fitting is performed on a held-out calibration set, and the resulting recalibration function is evaluated on an unseen test set. We tested our method against existing calibration approaches on various image classification datasets, and our spline-based recalibration approach consistently outperforms existing methods on KS error as well as on other commonly used calibration measures. Our code is available at https://github.com/kartikgupta-at-anu/spline-calibration.
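To make the idea of comparing cumulative distributions concrete, the following is a minimal NumPy sketch of a KS-style, binning-free calibration error. It is an illustration under stated assumptions, not the released implementation: it assumes top-label calibration, where `confidences` holds the predicted probability of the predicted class and `correct` is 1 where that prediction was right. The function and argument names are hypothetical.

```python
import numpy as np


def ks_calibration_error(confidences, correct):
    """KS-style calibration error (illustrative sketch, hypothetical name).

    Samples are sorted by predicted confidence. Two cumulative curves are
    then built over the sorted samples: the running sum of predicted
    confidences and the running sum of observed correctness, both divided
    by the number of samples. The error is the largest absolute gap
    between the two curves, so no binning is involved.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    n = confidences.shape[0]

    # Sort by confidence so both cumulative curves share the same ordering.
    order = np.argsort(confidences)
    cum_conf = np.cumsum(confidences[order]) / n  # cumulative predicted mass
    cum_acc = np.cumsum(correct[order]) / n       # cumulative observed accuracy

    return float(np.max(np.abs(cum_conf - cum_acc)))


if __name__ == "__main__":
    # Assumed toy data: labels are drawn so that the scores are calibrated
    # by construction, then the scores are sharpened to mimic overconfidence.
    rng = np.random.default_rng(0)
    scores = rng.uniform(0.5, 1.0, size=5000)
    labels = (rng.uniform(size=5000) < scores).astype(float)
    print("calibrated scores:   ", ks_calibration_error(scores, labels))
    print("overconfident scores:", ks_calibration_error(scores ** 0.25, labels))
```

Under these assumptions, a perfectly calibrated model gives two cumulative curves that nearly coincide, so the error is close to zero. A systematically overconfident model makes the confidence curve run above the accuracy curve. Fitting a smooth spline to the cumulative accuracy curve and differentiating it would then give a recalibration map; that step is not shown here.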