ML · CV · Oct 13, 2022

A Consistent and Differentiable Lp Canonical Calibration Error Estimator

arXiv:2210.07810v1 · 50 citations · h-index: 42
Originality: Incremental advance
AI Analysis

This paper addresses the need for reliable uncertainty estimates in probabilistic classifiers, particularly in multiclass settings. The advance is incremental: it builds on existing calibration concepts, and its contribution is a novel estimator.

The paper tackles the problem of poor calibration in deep neural networks, which output overconfident predictions, by proposing a low-bias, trainable estimator for the Lp canonical calibration error that asymptotically converges to the true error and enables efficient mini-batch updates.
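For orientation, the quantity being estimated can be written out. The block below is a sketch of the usual definition of the $L_p$ canonical calibration error, not a quotation from the paper; the symbols $f$ (the classifier, mapping an input to a probability vector), $x$ (the input), and $y$ (the one-hot label) are notation assumed here.

```latex
% L_p canonical calibration error of a classifier f (notation assumed):
% the expected L_p distance between the prediction f(x) and the true
% conditional label distribution given that prediction.
\mathrm{CE}_p(f) =
  \Bigl( \mathbb{E}\bigl[ \, \lVert \mathbb{E}[\, y \mid f(x) \,] - f(x) \rVert_p^p \, \bigr] \Bigr)^{1/p}
```

A classifier is canonically calibrated when $\mathbb{E}[y \mid f(x)] = f(x)$ almost surely, in which case this error is zero. The difficulty is that the inner conditional expectation is unknown and has to be estimated from data.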

Calibrated probabilistic classifiers are models whose predicted probabilities can directly be interpreted as uncertainty estimates. It has been shown recently that deep neural networks are poorly calibrated and tend to output overconfident predictions. As a remedy, we propose a low-bias, trainable calibration error estimator based on Dirichlet kernel density estimates, which asymptotically converges to the true $L_p$ calibration error. This novel estimator enables us to tackle the strongest notion of multiclass calibration, called canonical (or distribution) calibration, while other common calibration methods are tractable only for top-label and marginal calibration. The computational complexity of our estimator is $\mathcal{O}(n^2)$, the convergence rate is $\mathcal{O}(n^{-1/2})$, and it is unbiased up to $\mathcal{O}(n^{-2})$, achieved by a geometric series debiasing scheme. In practice, this means that the estimator can be applied to small subsets of data, enabling efficient estimation and mini-batch updates. The proposed method has a natural choice of kernel, and can be used to generate consistent estimates of other quantities based on conditional expectation, such as the sharpness of a probabilistic classifier. Empirical results validate the correctness of our estimator, and demonstrate its utility in canonical calibration error estimation and calibration error regularized risk minimization.
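The idea in the abstract can be illustrated with a minimal NumPy sketch of a plug-in estimator: the conditional expectation of the label given the prediction is approximated by a leave-one-out kernel-weighted average using Dirichlet kernels, and the $L_p$ gap to the prediction is averaged over the sample. This is a sketch under stated assumptions, not the authors' implementation: the kernel parameterization `alpha_i = probs[i] / bandwidth + 1`, the default `bandwidth`, and the names `dirichlet_log_kernel` and `lp_canonical_ce` are all hypothetical, and the geometric series debiasing scheme mentioned in the abstract is omitted. The function returns the $p$-th power of the error and needs at least two samples.

```python
import math
import numpy as np

# Elementwise log-gamma using only the standard library.
_lgamma = np.vectorize(math.lgamma)


def dirichlet_log_kernel(probs, bandwidth):
    """Pairwise log Dirichlet kernel values.

    Entry [j, i] is the log density of Dirichlet(alpha_i) evaluated at
    probs[j], with alpha_i = probs[i] / bandwidth + 1 (assumed form).
    """
    alphas = probs / bandwidth + 1.0                      # shape (n, K)
    log_norm = _lgamma(alphas.sum(axis=1)) - _lgamma(alphas).sum(axis=1)
    log_p = np.log(np.clip(probs, 1e-12, 1.0))            # avoid log(0)
    # sum_k (alpha_i[k] - 1) * log probs[j, k], arranged as [j, i]
    return log_norm[None, :] + log_p @ (alphas - 1.0).T


def lp_canonical_ce(probs, labels_onehot, bandwidth=0.1, p=2):
    """Plug-in estimate of the L_p canonical calibration error (p-th power).

    probs:         (n, K) predicted probability vectors, rows sum to 1
    labels_onehot: (n, K) one-hot labels
    Builds an n-by-n kernel matrix, so the cost is quadratic in n.
    """
    log_k = dirichlet_log_kernel(probs, bandwidth)
    np.fill_diagonal(log_k, -np.inf)                      # leave-one-out
    log_k = log_k - log_k.max(axis=1, keepdims=True)      # numerical stability
    weights = np.exp(log_k)
    weights /= weights.sum(axis=1, keepdims=True)
    cond_mean = weights @ labels_onehot                   # estimate of E[y | f(x_j)]
    return float(np.mean(np.sum(np.abs(cond_mean - probs) ** p, axis=1)))


# Usage on synthetic data: labels are drawn from the predicted distribution,
# so the predictions are calibrated by construction.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(3), size=200)
labels = np.array([rng.choice(3, p=row) for row in probs])
ce_estimate = lp_canonical_ce(probs, np.eye(3)[labels])
```

Because every operation here is a smooth function of `probs`, the same computation could be rewritten in an autodiff framework and added to a training loss on each mini-batch, which is the regularized risk minimization use case the abstract describes.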

Code Implementations: 1 repo