Entropy methods for the confidence assessment of probabilistic classification models
This work addresses the need for better confidence evaluation in classification models, though it appears incremental as it builds on existing methods.
The paper tackles the problem of assessing confidence in probabilistic classification models by using the information in the predicted probability distribution that is normally discarded, and provides a theoretical explanation for the confidence degradation observed in the complement approach to Naive Bayes.
Many classification models produce a probability distribution as the outcome of a prediction. This information is generally compressed down to the single class with the highest associated probability. In this paper, we argue that part of the information discarded in this process can in fact be used to further evaluate the quality of models, and in particular the confidence with which each prediction is made. As an application of these ideas, we provide a theoretical explanation of a confidence degradation phenomenon observed in the complement approach to the (Bernoulli) Naive Bayes generative model.
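As a rough illustration of the idea above, the following Python sketch keeps the full predicted distribution and scores it with normalized Shannon entropy. This is an assumption made for illustration: the title points to entropy-based measures, but the text here does not specify the exact quantities used. The function names and the two example distributions are hypothetical.

```python
import math


def argmax_class(probs):
    """Index of the most probable class (the usual compressed output)."""
    return max(range(len(probs)), key=lambda i: probs[i])


def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return sum(-p * math.log(p) for p in probs if p > 0.0)


def normalized_entropy(probs):
    """Entropy divided by log(K): 0 for a one-hot output, 1 for a uniform one."""
    k = len(probs)
    if k < 2:
        return 0.0
    return shannon_entropy(probs) / math.log(k)


# Two hypothetical predicted distributions over three classes.
confident = [0.98, 0.01, 0.01]
hesitant = [0.40, 0.35, 0.25]

for name, probs in (("confident", confident), ("hesitant", hesitant)):
    print(name, argmax_class(probs), round(normalized_entropy(probs), 3))
```

Both distributions collapse to the same predicted class, so the argmax alone cannot tell them apart. The normalized entropy is low for the first and close to 1 for the second, which is the kind of per-prediction confidence signal that the compressed output throws away.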