Error-Driven Uncertainty Aware Training
This work addresses the reliability and trustworthiness of neural classifiers, particularly in image recognition, by providing a method to enhance uncertainty estimation, which is crucial for safe deployment in real-world applications.
The paper tackles the overconfidence of neural networks by introducing Error-Driven Uncertainty Aware Training (EUAT), a technique that minimizes uncertainty for correct predictions and maximizes it for incorrect ones. Compared to existing methods, EUAT yields higher-quality uncertainty estimates and better performance in trust classification and under distribution shifts.
Neural networks are often overconfident about their predictions, which undermines their reliability and trustworthiness. In this work, we present a novel technique, named Error-Driven Uncertainty Aware Training (EUAT), which aims to enhance the ability of neural classifiers to estimate their uncertainty correctly, namely to have high uncertainty when they output inaccurate predictions and low uncertainty when their output is accurate. The EUAT approach operates during the model's training phase by selectively employing two loss functions, depending on whether the training examples are correctly or incorrectly predicted by the model. This allows for pursuing the twofold goal of i) minimizing model uncertainty for correctly predicted inputs and ii) maximizing uncertainty for mispredicted inputs, while preserving the model's misprediction rate. We evaluate EUAT using diverse neural models and datasets in the image recognition domain, considering both non-adversarial and adversarial settings. The results show that EUAT outperforms existing approaches for uncertainty estimation (including other uncertainty-aware training techniques, calibration, ensembles, and DEUP) by providing uncertainty estimates that have higher quality not only when evaluated via statistical metrics (e.g., correlation with residuals) but also when employed to build binary classifiers that decide whether the model's output can be trusted, and under distributional data shifts.
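The selective use of two loss functions can be illustrated with a minimal sketch. The abstract does not give the exact form of either loss, so everything below is an assumption made for illustration: uncertainty is measured as the predictive entropy of the softmax output, the loss for correctly predicted examples is cross-entropy plus entropy, and the loss for mispredicted examples is cross-entropy minus entropy. The function name `error_driven_loss` and the unweighted combination of the two terms are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Predictive entropy, used here as the uncertainty measure (assumption).
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def cross_entropy(probs, label):
    # Standard cross-entropy for one example; clamped to avoid log(0).
    return -math.log(max(probs[label], 1e-12))

def error_driven_loss(logits, label):
    # Hypothetical per-example loss that switches on prediction correctness.
    probs = softmax(logits)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    ce = cross_entropy(probs, label)
    h = entropy(probs)
    if predicted == label:
        # Correct prediction: adding entropy penalizes uncertainty.
        return ce + h
    # Misprediction: subtracting entropy rewards uncertainty.
    return ce - h
```

Under these assumptions, a confidently correct example such as `error_driven_loss([5.0, 0.0, 0.0], 0)` receives a lower loss than a hesitantly correct one such as `error_driven_loss([1.0, 0.0, 0.0], 0)`, while a mispredicted example is charged less than its plain cross-entropy because the entropy term is subtracted. Keeping the cross-entropy term in both branches is one way to avoid trading accuracy for uncertainty, in line with the stated goal of preserving the misprediction rate.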