LG | AI | May 23, 2024

Credal Wrapper of Model Averaging for Uncertainty Estimation in Classification

Kaizheng Wang, Fabio Cuzzolin, Keivan Shariatmadar, David Moens, Hans Hallez

arXiv:2405.15047v214.28 citations | h-index: 9 | ICLR

Originality: Incremental advance

AI Analysis

This work addresses uncertainty estimation for classification tasks, particularly in out-of-distribution detection, but appears incremental as it builds on existing BNN and DE methods.

The paper tackles uncertainty estimation in classification by proposing a credal wrapper method for Bayesian neural networks and deep ensembles, which improves out-of-distribution detection and reduces expected calibration error on corrupted data compared to baselines.

This paper presents an innovative approach, called credal wrapper, to formulating a credal set representation of model averaging for Bayesian neural networks (BNNs) and deep ensembles (DEs), capable of improving uncertainty estimation in classification tasks. Given a finite collection of single predictive distributions derived from BNNs or DEs, the proposed credal wrapper approach extracts an upper and a lower probability bound per class, acknowledging the epistemic uncertainty due to the availability of only a limited number of distributions. Such probability intervals over classes can be mapped to a convex set of probabilities (a credal set) from which, in turn, a unique prediction can be obtained using a transformation called the intersection probability transformation. In this article, we conduct extensive experiments on several out-of-distribution (OOD) detection benchmarks, encompassing various dataset pairs (CIFAR10/100 vs SVHN/Tiny-ImageNet, CIFAR10 vs CIFAR10-C, CIFAR100 vs CIFAR100-C, and ImageNet vs ImageNet-O) and using different network architectures (such as VGG16, ResNet-18/50, EfficientNet B2, and ViT Base). Compared to the BNN and DE baselines, the proposed credal wrapper method exhibits superior performance in uncertainty estimation and achieves a lower expected calibration error on corrupted data.
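The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the per-class bounds are the minimum and maximum over the ensemble members' predicted probabilities, and that the single prediction uses the standard intersection probability for probability intervals, p(c) = l(c) + β (u(c) − l(c)) with β = (1 − Σ l) / Σ (u − l). The function names `credal_bounds` and `intersection_probability` are hypothetical, and the three member predictions are made-up numbers for illustration only.

```python
import numpy as np


def credal_bounds(member_probs):
    """Per-class lower/upper probability bounds from a finite set of predictions.

    member_probs: array of shape (n_members, n_classes); each row is one
    predictive distribution (e.g. from one ensemble member or one BNN sample).
    Assumption: bounds are the element-wise min and max across members.
    """
    member_probs = np.asarray(member_probs, dtype=float)
    return member_probs.min(axis=0), member_probs.max(axis=0)


def intersection_probability(lower, upper):
    """Map probability intervals [lower, upper] to a single distribution.

    Every class moves the same fraction beta of the way from its lower bound
    to its upper bound, with beta chosen so the result sums to one.
    """
    width = upper - lower
    total_width = width.sum()
    if np.isclose(total_width, 0.0):
        # All members agree, so the intervals collapse to one distribution.
        return lower / lower.sum()
    beta = (1.0 - lower.sum()) / total_width
    return lower + beta * width


# Illustrative (made-up) predictions from three ensemble members, three classes.
members = np.array([
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.6, 0.1, 0.3],
])

lower, upper = credal_bounds(members)
prediction = intersection_probability(lower, upper)
print("lower bounds:", lower)
print("upper bounds:", upper)
print("intersection probability:", prediction)
```

In this sketch, wider intervals indicate stronger disagreement among members, which is the epistemic-uncertainty signal the abstract refers to. How the paper turns the credal set into an uncertainty score for OOD detection is not stated in the text above and is not reproduced here.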
