LG · ML · Feb 9, 2024

Are Uncertainty Quantification Capabilities of Evidential Deep Learning a Mirage?

arXiv:2402.06160v3 · 21 citations · h-index: 33 · NIPS
Originality: Incremental advance
AI Analysis

This work critically assesses a popular uncertainty quantification method in machine learning, highlights its limitations, and suggests improvements. It is an incremental analysis aimed at researchers in uncertainty estimation.

This paper questions the effectiveness of evidential deep learning (EDL) for uncertainty quantification, revealing that its learned epistemic uncertainties are unreliable and non-vanishing even with infinite data, and concludes that EDL's empirical success occurs despite poor uncertainty quantification.

This paper questions the effectiveness of a modern predictive uncertainty quantification approach, called evidential deep learning (EDL), in which a single neural network model is trained to learn a meta distribution over the predictive distribution by minimizing a specific objective function. Despite the perceived strong empirical performance of EDL methods on downstream tasks, a line of recent studies by Bengs et al. identifies limitations of the existing methods and concludes that their learned epistemic uncertainties are unreliable, e.g., in that they are non-vanishing even with infinite data. Building on and sharpening such analysis, we 1) provide a sharper understanding of the asymptotic behavior of a wide class of EDL methods by unifying various objective functions; 2) reveal that the EDL methods can be better interpreted as an out-of-distribution detection algorithm based on energy-based models; and 3) conduct extensive ablation studies to better assess their empirical effectiveness with real-world datasets. Through all these analyses, we conclude that even when EDL methods are empirically effective on downstream tasks, this occurs despite their poor uncertainty quantification capabilities. Our investigation suggests that incorporating model uncertainty can help EDL methods faithfully quantify uncertainties and further improve performance on representative downstream tasks, albeit at the cost of additional computational complexity.
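To make the "meta distribution over the predictive distribution" concrete, here is a minimal sketch for classification. It assumes one parameterization commonly used in the EDL literature: the network outputs non-negative per-class evidence, and the Dirichlet parameters are `alpha = evidence + 1`. The function name `dirichlet_uncertainty` and the use of "vacuity" as the epistemic score are illustrative assumptions, not the paper's own code or the specific objectives it analyzes.

```python
import math


def dirichlet_uncertainty(evidence):
    """Summarize a Dirichlet meta distribution built from per-class evidence.

    Assumption (illustrative): alpha_k = evidence_k + 1, so zero evidence
    corresponds to a uniform Dirichlet over class probabilities.
    """
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")

    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)  # total Dirichlet concentration

    # Mean of the Dirichlet: the expected predictive distribution.
    mean = [a / strength for a in alpha]

    # Vacuity: a common epistemic-uncertainty proxy, equal to 1 with no
    # evidence and shrinking toward 0 as total evidence grows.
    vacuity = num_classes / strength

    # Entropy of the mean prediction, often read as total uncertainty.
    entropy = -sum(p * math.log(p) for p in mean)

    return mean, vacuity, entropy


if __name__ == "__main__":
    for evidence in ([0.0, 0.0, 0.0], [27.0, 0.0, 0.0]):
        mean, vacuity, entropy = dirichlet_uncertainty(evidence)
        print(evidence, [round(p, 3) for p in mean], vacuity, round(entropy, 3))
```

Under this assumed parameterization, vacuity depends only on the total evidence the network emits. The abstract's point is that for existing EDL objectives this learned quantity does not vanish even with infinite data, so a score like this should be read with caution rather than as a faithful epistemic uncertainty.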

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
