LG | ML | Feb 19, 2020

Being Bayesian about Categorical Probability

arXiv:2002.07965v2 | 69 citations
AI Analysis

This paper addresses uncertainty estimation and model calibration in classification tasks, offering machine learning practitioners a plug-and-play loss function with negligible computational overhead.

The paper tackles the overconfidence and lack of uncertainty representation in neural networks using softmax by proposing a Bayesian alternative that models categorical probability as a random variable, resulting in consistent gains in generalization performance across multiple challenging tasks.

Neural networks utilize the softmax as a building block in classification tasks, which contains an overconfidence problem and lacks an uncertainty representation ability. As a Bayesian alternative to the softmax, we consider a random variable of a categorical probability over class labels. In this framework, the prior distribution explicitly models the presumed noise inherent in the observed label, which provides consistent gains in generalization performance in multiple challenging tasks. The proposed method inherits advantages of Bayesian approaches that achieve better uncertainty estimation and model calibration. Our method can be implemented as a plug-and-play loss function with negligible computational overhead compared to the softmax with the cross-entropy loss function.
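To make the "plug-and-play loss function" idea concrete, here is a minimal sketch of one way such a loss could look. It assumes details the abstract does not state: that the categorical probability gets a Dirichlet variational distribution with concentration `exp(logits)`, that the prior is a symmetric Dirichlet, and that the loss is a negative evidence lower bound. The names `digamma`, `dirichlet_kl`, `dirichlet_elbo_loss`, `prior`, and `kl_weight` are illustrative and are not taken from the paper or its code.

```python
import math


def digamma(x):
    """Digamma function psi(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift x upward until the asymptotic series is accurate
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def dirichlet_kl(alpha, beta):
    """KL( Dir(alpha) || Dir(beta) ) for concentration lists of equal length."""
    alpha0, beta0 = sum(alpha), sum(beta)
    kl = math.lgamma(alpha0) - sum(math.lgamma(a) for a in alpha)
    kl -= math.lgamma(beta0) - sum(math.lgamma(b) for b in beta)
    psi0 = digamma(alpha0)
    kl += sum((a - b) * (digamma(a) - psi0) for a, b in zip(alpha, beta))
    return kl


def dirichlet_elbo_loss(logits, label, prior=1.0, kl_weight=1.0):
    """Negative ELBO for one example (assumed form, see lead-in).

    logits    : raw network outputs, one per class
    label     : index of the observed class
    prior     : concentration of the assumed symmetric Dirichlet prior
    kl_weight : hypothetical knob weighting the KL term
    """
    alpha = [math.exp(v) for v in logits]  # assumed: concentration = exp(logit)
    beta = [prior] * len(logits)
    # E_q[log z_label] under Dir(alpha) is psi(alpha_label) - psi(sum(alpha)).
    expected_log_lik = digamma(alpha[label]) - digamma(sum(alpha))
    return -expected_log_lik + kl_weight * dirichlet_kl(alpha, beta)


if __name__ == "__main__":
    print(dirichlet_elbo_loss([2.0, 0.5, -1.0], label=0))
```

In this sketch the function takes the same inputs as softmax cross-entropy (logits and a label), which is what would make a loss of this kind a drop-in replacement. The KL term pulls the predicted concentration toward the prior, so the prior encodes how much noise is presumed in the observed label.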

Code Implementations: 1 repo
