Introducing an Improved Information-Theoretic Measure of Predictive Uncertainty
This work addresses a critical issue for practitioners using ML models in real-world decision-making, offering an incremental improvement over existing uncertainty measures.
The paper tackles the problem of quantifying predictive uncertainty in machine learning models, showing that the standard entropy-based measure is flawed and introducing a new theoretically grounded measure that performs better in synthetic tasks and on ImageNet.
Applying a machine learning model for decision-making in the real world requires distinguishing what the model knows from what it does not. A critical factor in assessing the knowledge of a model is quantifying its predictive uncertainty. Predictive uncertainty is commonly measured by the entropy of the Bayesian model average (BMA) predictive distribution. Yet the properness of this measure was recently questioned. We provide new insights into its limitations. Our analyses show that the current measure erroneously assumes that the BMA predictive distribution is equivalent to the predictive distribution of the true model that generated the dataset. Consequently, we introduce a theoretically grounded measure that overcomes these limitations. We experimentally verify the benefits of the proposed measure: it behaves more reasonably in controlled synthetic tasks, and our evaluations on ImageNet demonstrate that it is advantageous in real-world applications that rely on predictive uncertainty.
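To make the standard measure concrete, the following minimal sketch computes the entropy of a BMA predictive distribution from a handful of sampled models. It illustrates only the commonly used measure described above, not the measure introduced in this work. The function name `bma_predictive_entropy`, the uniform weighting of models, and the toy probability vectors are assumptions made for illustration.

```python
import numpy as np


def bma_predictive_entropy(member_probs, eps=1e-12):
    """Entropy (in nats) of the averaged predictive distribution.

    member_probs: array-like of shape (n_models, n_classes); each row is
    the class-probability vector predicted by one sampled model.
    Assumption: models are weighted uniformly, as in a simple ensemble
    or Monte Carlo approximation of the BMA.
    """
    member_probs = np.asarray(member_probs, dtype=float)
    bma = member_probs.mean(axis=0)  # approximate BMA predictive distribution
    # eps guards against log(0) for classes with zero averaged probability
    return float(-np.sum(bma * np.log(bma + eps)))


# Hypothetical toy inputs: two models that confidently disagree ...
disagree = [[0.99, 0.01], [0.01, 0.99]]
# ... and two models that agree on a uniform prediction.
agree = [[0.5, 0.5], [0.5, 0.5]]

h_disagree = bma_predictive_entropy(disagree)
h_agree = bma_predictive_entropy(agree)
print(h_disagree, h_agree)
```

In both toy cases the averaged distribution is uniform over two classes, so the entropy of the BMA predictive distribution is approximately log 2 in each.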