LG AI | Jul 17, 2024

Uncertainty Calibration with Energy Based Instance-wise Scaling in the Wild Dataset

arXiv:2407.12330v16.43 citations | h-index: 3 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses reliability issues in AI systems for safety-critical applications, but the advance is incremental because it builds on existing calibration methods.

The paper addresses the lack of uncertainty representation in deep neural networks by proposing a robust post-hoc calibration method for multi-class classification. Compared with state-of-the-art methods, it performs consistently across in-distribution and out-of-distribution scenarios.

With the rapid advancement in the performance of deep neural networks (DNNs), there has been significant interest in deploying and incorporating artificial intelligence (AI) systems into real-world scenarios. However, many DNNs lack the ability to represent uncertainty, often exhibiting excessive confidence even when making incorrect predictions. To ensure the reliability of AI systems, particularly in safety-critical cases, DNNs should transparently reflect the uncertainty in their predictions. In this paper, we investigate robust post-hoc uncertainty calibration methods for DNNs within the context of multi-class classification tasks. While previous studies have made notable progress, they still face challenges in achieving robust calibration, particularly in scenarios involving out-of-distribution (OOD) data. We identify that previous methods lack adaptability to individual input data and struggle to accurately estimate uncertainty when processing inputs drawn from the wild dataset. To address this issue, we introduce a novel instance-wise calibration method based on an energy model. Our method incorporates energy scores instead of softmax confidence scores, allowing for adaptive consideration of DNN uncertainty for each prediction within a logit space. In experiments, we show that the proposed method consistently maintains robust performance across the spectrum, spanning from in-distribution to OOD scenarios, when compared to other state-of-the-art methods.
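To make the idea of energy-based, instance-wise scaling concrete, here is a minimal sketch, not the paper's implementation. The energy score uses the standard free-energy form, E(x) = -T * logsumexp(logits / T). Everything else is an assumption made for illustration: the linear rule that maps energy to a per-sample temperature, and the names `instance_wise_calibrate`, `base_temperature`, `slope`, and `energy_ref` are hypothetical and do not come from the abstract.

```python
import numpy as np

def energy_score(logits, temperature=1.0):
    """Free energy E(x) = -T * logsumexp(logits / T), computed stably per row."""
    z = np.asarray(logits, dtype=float) / temperature
    m = z.max(axis=-1, keepdims=True)
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return -temperature * lse

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def instance_wise_calibrate(logits, base_temperature=1.5, slope=0.1, energy_ref=-5.0):
    """Hypothetical rule (an assumption, not the paper's): samples whose energy
    is higher than energy_ref get a larger temperature, i.e. softer probabilities."""
    logits = np.asarray(logits, dtype=float)
    energy = energy_score(logits)
    # One temperature per input; clip so every temperature stays positive.
    temps = np.clip(base_temperature + slope * (energy - energy_ref), 0.05, None)
    return softmax(logits / temps[:, None]), temps

# Example logits: one confident prediction and one ambiguous prediction.
logits = np.array([[10.0, 0.0, 0.0],
                   [1.0, 0.5, 0.0]])
probs, temps = instance_wise_calibrate(logits)
```

Because each row is divided by its own positive temperature, the predicted class is unchanged and only the confidence moves. In this sketch the ambiguous second row has the higher energy and therefore receives the larger temperature.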
