LG, AI, May 29, 2023

Beyond Confidence: Reliable Models Should Also Consider Atypicality

arXiv:2305.18262v2, 29 citations
AI Analysis

This work addresses the need for more reliable uncertainty estimation in machine learning models, particularly for atypical inputs. It is an incremental improvement over existing confidence-based methods.

The paper addresses the problem that confidence alone is not enough to assess the reliability of machine learning predictions. It shows that the atypicality (rarity) of a sample or class is strongly related to miscalibration and lower accuracy, and that incorporating atypicality improves uncertainty quantification and model performance. In a case study, using atypicality improves a skin lesion classifier's performance across skin tone groups without access to group attributes.

While most machine learning models can provide confidence in their predictions, confidence is insufficient to understand a prediction's reliability. For instance, the model may have a low-confidence prediction if the input is not well-represented in the training dataset or if the input is inherently ambiguous. In this work, we investigate the relationship between how atypical (rare) a sample or a class is and the reliability of a model's predictions. We first demonstrate that atypicality is strongly related to miscalibration and accuracy. In particular, we empirically show that predictions for atypical inputs or atypical classes are more overconfident and have lower accuracy. Using these insights, we show that incorporating atypicality improves uncertainty quantification and model performance for discriminative neural networks and large language models. In a case study, we show that using atypicality improves the performance of a skin lesion classifier across different skin tone groups without having access to the group attributes. Overall, we propose that models should use not only confidence but also atypicality to improve uncertainty quantification and performance. Our results demonstrate that simple post-hoc atypicality estimators can provide significant value.
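To make the idea of a "simple post-hoc atypicality estimator" concrete, here is a minimal sketch of one possible estimator and of how calibration could be compared across atypicality groups. It is an illustration under stated assumptions, not the paper's implementation: it assumes a single Gaussian fitted to training-set feature embeddings, uses the squared Mahalanobis distance as the atypicality score, and all function names (`fit_gaussian`, `atypicality`, `ece_by_atypicality`) are hypothetical.

```python
import numpy as np

def fit_gaussian(features):
    """Fit one Gaussian to training features (an assumed density model)."""
    mean = features.mean(axis=0)
    # Small ridge term keeps the covariance invertible.
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return mean, np.linalg.inv(cov)

def atypicality(x, mean, precision):
    """Squared Mahalanobis distance; larger means further from training data."""
    diff = x - mean
    return np.einsum("ij,jk,ik->i", diff, precision, diff)

def expected_calibration_error(confidence, correct, n_bins=10):
    """Standard binned ECE: weighted mean of |accuracy - confidence| per bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidence > lo) & (confidence <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidence[mask].mean())
            ece += mask.mean() * gap
    return ece

def ece_by_atypicality(scores, confidence, correct, n_groups=4):
    """Sort samples by atypicality, split into equal groups, report ECE per group."""
    order = np.argsort(scores)
    return [
        expected_calibration_error(confidence[idx], correct[idx])
        for idx in np.array_split(order, n_groups)
    ]
```

If the abstract's claim holds for a given model, the per-group ECE returned by `ece_by_atypicality` would tend to rise from the most typical group to the most atypical one. Any recalibration step built on these scores is left out here, since the summary does not specify one.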

