LG · ML · Feb 8, 2024

On Temperature Scaling and Conformal Prediction of Deep Classifiers

arXiv:2402.05806v4 · 11 citations · h-index: 17 · ICML
Originality: Incremental advance
AI Analysis

This paper addresses the need for reliable confidence indications in classification applications and offers insights for practitioners who use adaptive conformal prediction methods, though it is an incremental advance that explores existing techniques.

The paper investigates the interplay between temperature scaling calibration and conformal prediction for deep classifiers, showing that while calibration improves class-conditional coverage, it negatively affects prediction set sizes, and provides guidelines for practitioners to balance these trade-offs.

In many classification applications, the prediction of a deep neural network (DNN) based classifier needs to be accompanied by some confidence indication. Two popular approaches for this purpose are: 1) Calibration: modifies the classifier's softmax values such that the maximal value better estimates the correctness probability; and 2) Conformal Prediction (CP): produces a prediction set of candidate labels that contains the true label with a user-specified probability, guaranteeing marginal coverage but not, e.g., per-class coverage. In practice, both types of indications are desirable, yet so far the interplay between them has not been investigated. Focusing on the ubiquitous Temperature Scaling (TS) calibration, we start this paper with an extensive empirical study of its effect on prominent CP methods. We show that while TS calibration improves the class-conditional coverage of adaptive CP methods, surprisingly, it negatively affects their prediction set sizes. Motivated by this behavior, we explore the effect of TS on CP beyond its calibration application and reveal an intriguing trend under which it allows trading prediction set size against conditional coverage of adaptive CP methods. Then, we establish a mathematical theory that explains the entire non-monotonic trend. Finally, based on our experiments and theory, we offer simple guidelines for practitioners to effectively combine adaptive CP with calibration, aligned with user-defined goals.
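To make the two ingredients of the abstract concrete, here is a minimal sketch of temperature scaling followed by split conformal prediction with an adaptive score. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the adaptive score is a non-randomized APS-style score (cumulative softmax mass down to the true label), and the quantile call assumes NumPy 1.22 or later for the `method` argument.

```python
import numpy as np

def softmax_with_temperature(logits, T=1.0):
    """Temperature-scaled softmax: softmax(logits / T), computed row-wise."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def _sorted_cumulative_mass(probs):
    """Return (order, cum): classes sorted by descending probability and
    the cumulative probability mass along that ordering."""
    order = np.argsort(-probs, axis=1)
    cum = np.cumsum(np.take_along_axis(probs, order, axis=1), axis=1)
    return order, cum

def aps_scores(probs, labels):
    """Non-randomized APS-style score (an assumption of this sketch):
    total mass of all classes ranked at or above the true label."""
    order, cum = _sorted_cumulative_mass(probs)
    rank = np.argmax(order == labels[:, None], axis=1)  # position of the true label
    return cum[np.arange(len(labels)), rank]

def conformal_quantile(scores, alpha):
    """Finite-sample corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

def aps_prediction_sets(probs, qhat):
    """Boolean matrix (n_samples, n_classes): True where a class is in the set.
    A class is included when its score is at most qhat; the top-1 class is
    always kept, which can only enlarge the set."""
    order, cum = _sorted_cumulative_mass(probs)
    keep_sorted = cum <= qhat
    keep_sorted[:, 0] = True
    sets = np.zeros_like(keep_sorted)
    np.put_along_axis(sets, order, keep_sorted, axis=1)  # undo the sorting
    return sets
```

A typical use, under the same assumptions, is to apply `softmax_with_temperature` with one value of `T` to both calibration and test logits, compute `qhat` from the calibration scores, and then compare the mean of `sets.sum(axis=1)` (average set size) and per-class coverage across several values of `T`. This is the kind of sweep the abstract describes when it speaks of trading prediction set size against conditional coverage.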

