LG ML, Feb 19, 2023

Credal Bayesian Deep Learning

Michele Caprio, Souradeep Dutta, Kuk Jin Jang, Vivian Lin, Radoslav Ivanov, Oleg Sokolsky, Insup Lee

arXiv:2302.09656v5, 25.544 citations, h-index: 68

Originality: Incremental advance

AI Analysis

This work addresses uncertainty quantification and robustness in machine learning, particularly for safety-critical applications such as autonomous driving and medical systems. It is an incremental advance, building on existing concepts from BNNs and imprecise probability.

The paper tackles the problem of distinguishing and quantifying aleatoric and epistemic uncertainty in Bayesian Neural Networks (BNNs). It introduces Credal Bayesian Deep Learning (CBDL), which uses finitely generated credal sets to train what is heuristically an infinite ensemble of BNNs from only finitely many elements. The authors report improved robustness to prior and likelihood misspecification and to distribution shift, along with better uncertainty quantification and better downstream performance (motion prediction for autonomous driving, artificial pancreas control) than baseline BNN ensembles.

Uncertainty quantification and robustness to distribution shifts are important goals in machine learning and artificial intelligence. Although Bayesian Neural Networks (BNNs) allow the uncertainty in predictions to be assessed, they cannot properly distinguish between different sources of predictive uncertainty. We present Credal Bayesian Deep Learning (CBDL). Heuristically, CBDL allows one to train an (uncountably) infinite ensemble of BNNs using only finitely many elements. This is possible thanks to prior and likelihood finitely generated credal sets (FGCSs), a concept from the imprecise probability literature. Intuitively, convex combinations of a finite collection of prior-likelihood pairs are able to represent infinitely many such pairs. After training, CBDL outputs a set of posteriors on the parameters of the neural network. At inference time, this posterior set is used to derive a set of predictive distributions, which is in turn used to distinguish between (predictive) aleatoric and epistemic uncertainties and to quantify them. The predictive set also produces either (i) a collection of outputs enjoying desirable probabilistic guarantees, or (ii) the single output that is deemed the best, that is, the one having the highest predictive lower probability, another imprecise-probabilistic concept. CBDL is more robust than single BNNs to prior and likelihood misspecification, and to distribution shift. We show that CBDL is better at quantifying and disentangling different types of (predictive) uncertainties than single BNNs and ensembles of BNNs. In addition, we apply CBDL to two case studies to demonstrate its capabilities on downstream tasks: first, motion prediction in autonomous driving scenarios, and second, modeling blood glucose and insulin dynamics for artificial pancreas control. We show that CBDL performs better than an ensemble-of-BNNs baseline.
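
To make the idea of a finitely generated credal set concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simplified discrete setting in which each extreme point of the predictive set is a probability vector over three classes; the array `predictive` and the helper names `lower_upper_probabilities` and `best_class` are hypothetical. Because the probability of a class is linear in the distribution, its minimum and maximum over all convex combinations are attained at the finitely many extreme points, so the lower and upper probabilities can be read off from those points alone.

```python
import numpy as np


def lower_upper_probabilities(extreme_points):
    """Per-class lower and upper probabilities over the convex hull of the rows.

    Each row is one extreme point (a probability vector over the classes).
    Class probability is linear in the distribution, so the extrema over the
    whole convex hull are attained at the extreme points.
    """
    pts = np.asarray(extreme_points, dtype=float)
    return pts.min(axis=0), pts.max(axis=0)


def best_class(extreme_points):
    """Index of the class with the highest lower probability."""
    lower, _ = lower_upper_probabilities(extreme_points)
    return int(np.argmax(lower))


# Hypothetical predictive distributions over 3 classes (illustrative values,
# not taken from the paper). Each row plays the role of one extreme point.
predictive = np.array([
    [0.6, 0.3, 0.1],
    [0.5, 0.2, 0.3],
    [0.7, 0.2, 0.1],
])

lower, upper = lower_upper_probabilities(predictive)
chosen = best_class(predictive)

# Any convex combination of the extreme points is itself a member of the set,
# and its class probabilities stay inside the [lower, upper] bounds.
rng = np.random.default_rng(0)
weights = rng.dirichlet(np.ones(len(predictive)))
mixture = weights @ predictive
```

In this toy example the gap `upper - lower` for a class gives a rough sense of how much the extreme points disagree about it. The paper's own uncertainty measures and its treatment of continuous outputs are more involved than this sketch.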
