Learning for Single-Shot Confidence Calibration in Deep Neural Networks through Stochastic Inferences
This work addresses overconfidence in deep neural networks, a problem that matters for reliable AI applications. The contribution is incremental, however, since it builds on existing stochastic regularization techniques such as dropout and stochastic depth.
The paper tackles overconfidence in deep neural networks by proposing a variance-weighted confidence-integrated loss function that calibrates prediction confidence using stochastic inferences; the authors report improved calibration performance and classification accuracy across multiple models and datasets.
We propose a generic framework to calibrate the accuracy and confidence of a prediction in deep neural networks through stochastic inferences. We interpret stochastic regularization using a Bayesian model, and analyze the relation between the predictive uncertainty of networks and the variance of the prediction scores obtained by stochastic inferences for a single example. Our empirical study shows that the accuracy and the score of a prediction are highly correlated with the variance of multiple stochastic inferences given by stochastic depth or dropout. Motivated by this observation, we design a novel variance-weighted confidence-integrated loss function that is composed of two cross-entropy loss terms, one with respect to the ground-truth distribution and one with respect to the uniform distribution, which are balanced by the variance of the stochastic prediction scores. The proposed loss function enables us to learn deep neural networks that predict confidence-calibrated scores using a single inference. Our algorithm presents outstanding confidence calibration performance and improves classification accuracy when combined with two popular stochastic regularization techniques, stochastic depth and dropout, in multiple models and datasets; it significantly alleviates the overconfidence issue in deep neural networks by training networks to achieve prediction accuracy proportional to the confidence of the prediction.
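The loss described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes that the balancing weight `alpha` is the mean per-class variance of the T stochastic softmax outputs, rescaled to [0, 1], and that both cross-entropy terms are evaluated on the mean stochastic prediction. The function names (`normalized_variance`, `vwci_loss`) and the normalization constant are illustrative choices; the paper's exact definition may differ.

```python
import math


def cross_entropy(target, probs, eps=1e-12):
    """Cross-entropy H(target, probs) between two discrete distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


def normalized_variance(samples):
    """Mean per-class variance of T stochastic predictions, scaled to [0, 1].

    Assumption: a value in [0, 1] has population variance of at most 0.25,
    so dividing by 0.25 maps the mean variance into [0, 1].
    """
    num_samples = len(samples)
    num_classes = len(samples[0])
    total = 0.0
    for c in range(num_classes):
        column = [s[c] for s in samples]
        mean = sum(column) / num_samples
        total += sum((v - mean) ** 2 for v in column) / num_samples
    return min(1.0, (total / num_classes) / 0.25)


def vwci_loss(samples, label):
    """Variance-weighted mix of two cross-entropy terms (illustrative only).

    samples: list of T softmax outputs from stochastic forward passes
             (e.g. with dropout or stochastic depth active).
    label:   index of the ground-truth class.
    """
    num_classes = len(samples[0])
    alpha = normalized_variance(samples)
    one_hot = [1.0 if c == label else 0.0 for c in range(num_classes)]
    uniform = [1.0 / num_classes] * num_classes
    # Simplification: score the mean of the stochastic predictions.
    mean_probs = [
        sum(s[c] for s in samples) / len(samples) for c in range(num_classes)
    ]
    ce_ground_truth = cross_entropy(one_hot, mean_probs)
    ce_uniform = cross_entropy(uniform, mean_probs)
    # Low variance: trust the label. High variance: pull towards uniform.
    return (1.0 - alpha) * ce_ground_truth + alpha * ce_uniform


# Stochastic passes that agree give alpha = 0, i.e. plain cross-entropy.
consistent = [[0.7, 0.2, 0.1]] * 3
# Stochastic passes that fully disagree give alpha = 1, i.e. the uniform term only.
conflicting = [[1.0, 0.0], [0.0, 1.0]]
loss_consistent = vwci_loss(consistent, label=0)
loss_conflicting = vwci_loss(conflicting, label=0)
```

In this sketch, an example whose stochastic predictions agree is trained with ordinary cross-entropy, while an example with high prediction variance is pushed towards the uniform distribution. This is the mechanism by which a single deterministic inference can yield less overconfident scores.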