LG ST ML May 24, 2024

Information-theoretic Generalization Analysis for Expected Calibration Error

arXiv:2405.15709v213.415 citations h-index: 6 NIPS

Originality: Incremental advance

AI Analysis

This work addresses a theoretical gap in evaluating the calibration performance of machine learning models. Calibration is crucial for reliable uncertainty estimates in applications such as healthcare and autonomous systems. The advance is incremental because it builds on existing binning methods.

The paper tackles the limited theoretical understanding of estimation bias in the expected calibration error (ECE) by analyzing two common binning strategies, uniform mass and uniform width binning. It establishes improved upper bounds on the bias and identifies the number of bins that minimizes it. It then extends the bias analysis to a generalization error analysis, deriving bounds that allow numerical evaluation of the ECE on unknown data, with experiments showing that the bounds are nonvacuous.

While the expected calibration error (ECE), which employs binning, is widely adopted to evaluate the calibration performance of machine learning models, theoretical understanding of its estimation bias is limited. In this paper, we present the first comprehensive analysis of the estimation bias in the two common binning strategies, uniform mass and uniform width binning. Our analysis establishes upper bounds on the bias, achieving an improved convergence rate. Moreover, our bounds reveal, for the first time, the optimal number of bins to minimize the estimation bias. We further extend our bias analysis to generalization error analysis based on the information-theoretic approach, deriving upper bounds that enable the numerical evaluation of how small the ECE is for unknown data. Experiments using deep learning models show that our bounds are nonvacuous thanks to this information-theoretic generalization analysis approach.
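
As an illustration of the two binning strategies named in the abstract, here is a minimal sketch of a binned ECE estimator for binary classification. It is not the paper's implementation and does not compute any of its bounds. The function name `binned_ece`, its parameters, and the choice of binary labels are assumptions made for this example. The estimator sums, over bins, the fraction of samples in the bin times the absolute gap between the mean label and the mean predicted probability in that bin.

```python
import numpy as np

def binned_ece(probs, labels, n_bins=10, strategy="width"):
    """Binned ECE estimate for binary classification (illustrative sketch).

    probs    : predicted probabilities of the positive class, in [0, 1]
    labels   : binary labels in {0, 1}
    strategy : "width" for uniform width bins on [0, 1],
               "mass" for uniform mass bins (empirical quantiles of probs)
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)

    if strategy == "width":
        # Equally spaced bin edges on [0, 1].
        edges = np.linspace(0.0, 1.0, n_bins + 1)
    elif strategy == "mass":
        # Edges at empirical quantiles, so bins hold roughly equal counts.
        edges = np.quantile(probs, np.linspace(0.0, 1.0, n_bins + 1))
    else:
        raise ValueError("strategy must be 'width' or 'mass'")

    # Assign each sample a bin index in 0..n_bins-1 using the interior edges.
    bin_idx = np.digitize(probs, edges[1:-1])

    ece = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            # Weight = fraction of samples in the bin;
            # gap = |mean label - mean predicted probability| in the bin.
            ece += mask.mean() * abs(labels[mask].mean() - probs[mask].mean())
    return ece
```

Calling `binned_ece(probs, labels, n_bins=15, strategy="mass")` on held-out predictions gives the uniform mass estimate; switching `strategy` to `"width"` gives the uniform width one. The value of `n_bins` here is arbitrary and is not the bias-minimizing choice that the paper derives.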
