Wasserstein Dropout
This work addresses accurate and stable uncertainty quantification for neural networks, which is crucial for safe machine learning and for the practitioners who rely on these systems.
This paper introduces Wasserstein dropout, a purely non-parametric method for uncertainty quantification in neural networks for regression tasks. It minimizes the Wasserstein distance between the label distribution and the model distribution, and empirically yields more accurate and more stable uncertainty estimates than state-of-the-art methods on both vanilla and distributionally shifted test data.
Despite its importance for safe machine learning, uncertainty quantification for neural networks is far from solved. State-of-the-art approaches to estimating neural uncertainties are often hybrid, combining parametric models with explicit or implicit (dropout-based) ensembling. We take another path and propose a novel, purely non-parametric approach to uncertainty quantification for regression tasks: Wasserstein dropout. Technically, it captures aleatoric uncertainty by means of dropout-based sub-network distributions. This is accomplished by a new objective that minimizes the Wasserstein distance between the label distribution and the model distribution. An extensive empirical analysis shows that Wasserstein dropout outperforms state-of-the-art methods, producing more accurate and more stable uncertainty estimates on vanilla test data as well as under distributional shift.
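To make the idea of matching a sub-network distribution to a label distribution concrete, the following is a minimal sketch and not the authors' implementation. It assumes both distributions are summarized as one-dimensional Gaussians, for which the squared 2-Wasserstein distance has the closed form (mean difference)^2 + (standard deviation difference)^2. The toy linear model, the function names `gaussian_w2_squared` and `subnetwork_predictions`, and the label statistics are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def gaussian_w2_squared(mean_a, std_a, mean_b, std_b):
    """Squared 2-Wasserstein distance between two 1-D Gaussians (closed form)."""
    return (mean_a - mean_b) ** 2 + (std_a - std_b) ** 2


def subnetwork_predictions(x, weights, drop_rate, n_samples, rng):
    """Toy stand-in for dropout sub-networks (assumption): a linear model
    whose weights are randomly zeroed, with inverted-dropout rescaling."""
    keep = 1.0 - drop_rate
    preds = []
    for _ in range(n_samples):
        masked = [w / keep if rng.random() < keep else 0.0 for w in weights]
        preds.append(sum(w * xi for w, xi in zip(masked, x)))
    return preds


rng = random.Random(0)
preds = subnetwork_predictions(
    x=[1.0, 2.0, 3.0], weights=[0.5, -0.2, 0.1],
    drop_rate=0.1, n_samples=200, rng=rng,
)

# Summarize the sub-network outputs by their empirical mean and spread.
model_mean = statistics.fmean(preds)
model_std = statistics.pstdev(preds)

# Hypothetical label statistics for this input, used only for illustration.
label_mean, label_std = 0.4, 0.3

distance = gaussian_w2_squared(model_mean, model_std, label_mean, label_std)
print(f"squared W2 distance: {distance:.4f}")
```

In a training setting, a quantity of this kind would be minimized with respect to the network weights so that the spread of the sub-network outputs tracks the spread of the labels. How the label distribution is estimated per input is left open here, since the abstract does not specify it.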