Does confidence calibration improve conformal prediction?
This work addresses the problem of inefficient uncertainty quantification in conformal prediction for machine learning practitioners, offering an incremental improvement over existing calibration methods.
The paper investigates whether confidence calibration improves conformal prediction, finding that current methods like temperature scaling often increase prediction set sizes, and proposes Conformal Temperature Scaling (ConfTS) to enhance efficiency, reducing set sizes by up to 15% in experiments.
Conformal prediction is an emerging technique for uncertainty quantification that constructs prediction sets guaranteed to contain the true label with a predefined probability. Previous works often employ temperature scaling to calibrate classifiers, assuming that confidence calibration benefits conformal prediction. However, the specific impact of confidence calibration on conformal prediction remains underexplored. In this work, we make two key discoveries about the impact of confidence calibration methods on adaptive conformal prediction. Firstly, we empirically show that current confidence calibration methods (e.g., temperature scaling) typically lead to larger prediction sets in adaptive conformal prediction. Secondly, by investigating the role of temperature value, we observe that high-confidence predictions can enhance the efficiency of adaptive conformal prediction. Theoretically, we prove that predictions with higher confidence result in smaller prediction sets on expectation. This finding implies that the rescaling parameters in these calibration methods, when optimized with cross-entropy loss, might counteract the goal of generating efficient prediction sets. To address this issue, we propose Conformal Temperature Scaling (ConfTS), a variant of temperature scaling with a novel loss function designed to enhance the efficiency of prediction sets. This approach can be extended to optimize the parameters of other post-hoc methods of confidence calibration. Extensive experiments demonstrate that our method improves existing adaptive conformal prediction methods in classification tasks, especially with LLMs.