LGMLOct 27, 2018

Attended Temperature Scaling: A Practical Approach for Calibrating Deep Neural Networks

arXiv:1810.11586v341 citations
Originality Incremental advance
AI Analysis

This work addresses calibration issues in deep neural networks for critical decision-making applications like autonomous driving and medical diagnosis, but it is incremental as it builds upon the existing Temperature Scaling method.

The paper tackled the problem of deep neural networks being poorly calibrated, especially when using Temperature Scaling (TS) with small or noisy validation sets or highly accurate networks, and proposed Attended Temperature Scaling (ATS) which improved calibration in these challenging situations, achieving better results on various models and datasets, including a skin lesion detection application.

Recently, Deep Neural Networks (DNNs) have been achieving impressive results on wide range of tasks. However, they suffer from being well-calibrated. In decision-making applications, such as autonomous driving or medical diagnosing, the confidence of deep networks plays an important role to bring the trust and reliability to the system. To calibrate the deep networks' confidence, many probabilistic and measure-based approaches are proposed. Temperature Scaling (TS) is a state-of-the-art among measure-based calibration methods which has low time and memory complexity as well as effectiveness. In this paper, we study TS and show it does not work properly when the validation set that TS uses for calibration has small size or contains noisy-labeled samples. TS also cannot calibrate highly accurate networks as well as non-highly accurate ones. Accordingly, we propose Attended Temperature Scaling (ATS) which preserves the advantages of TS while improves calibration in aforementioned challenging situations. We provide theoretical justifications for ATS and assess its effectiveness on wide range of deep models and datasets. We also compare the calibration results of TS and ATS on skin lesion detection application as a practical problem where well-calibrated system can play important role in making a decision.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes