LG · Jan 23

Is BatchEnsemble a Single Model? On Calibration and Diversity of Efficient Ensembles

Anton Zamyatin, Patrick Indri, Sagar Malhotra, Thomas Gärtner

arXiv:2601.16936v1 · 3.82 citations · h-index: 4

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for efficient uncertainty estimation in resource-constrained settings. It finds that BatchEnsemble adds little over a single model and may not deliver the benefits that practitioners expect from an ensemble.

The paper asks whether BatchEnsemble provides ensemble-like uncertainty estimates. It finds that BatchEnsemble underperforms Deep Ensembles and behaves much like a single model in accuracy, calibration, and out-of-distribution detection on CIFAR10/10C/SVHN. In a controlled MNIST study, its members are near-identical in both function and parameter space.

In resource-constrained and low-latency settings, uncertainty estimates must be efficiently obtained. Deep Ensembles provide robust epistemic uncertainty (EU) but require training multiple full-size models. BatchEnsemble aims to deliver ensemble-like EU at far lower parameter and memory cost by applying learned rank-1 perturbations to a shared base network. We show that BatchEnsemble not only underperforms Deep Ensembles but closely tracks a single model baseline in terms of accuracy, calibration and out-of-distribution (OOD) detection on CIFAR10/10C/SVHN. A controlled study on MNIST finds members are near-identical in function and parameter space, indicating limited capacity to realize distinct predictive modes. Thus, BatchEnsemble behaves more like a single model than a true ensemble.
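To make "learned rank-1 perturbations to a shared base network" concrete, here is a minimal sketch of one BatchEnsemble-style linear layer in plain Python. It assumes the usual formulation in which member i uses the weight W ∘ (r_i s_iᵀ), with W shared across members and r_i, s_i as per-member vectors. The function names, the (in, out) weight layout, and the omission of a bias term are illustrative choices, not taken from the paper.

```python
def member_weight(shared, r, s):
    """Explicit member weight: W_i[j][k] = W[j][k] * r[j] * s[k]."""
    return [[shared[j][k] * r[j] * s[k] for k in range(len(s))]
            for j in range(len(r))]


def forward_explicit(shared, r, s, x):
    """Reference forward pass that materializes the member weight."""
    w = member_weight(shared, r, s)
    return [sum(x[j] * w[j][k] for j in range(len(x)))
            for k in range(len(s))]


def forward_fast(shared, r, s, x):
    """Equivalent forward pass that never builds W_i.

    Scale the input by r, multiply by the shared W, then scale the
    output by s: y = ((x * r) @ W) * s.
    """
    xr = [xi * ri for xi, ri in zip(x, r)]
    out = [sum(xr[j] * shared[j][k] for j in range(len(xr)))
           for k in range(len(s))]
    return [o * sk for o, sk in zip(out, s)]


def param_counts(n_in, n_out, members):
    """Weight counts for one layer: (BatchEnsemble, Deep Ensemble)."""
    batch_ensemble = n_in * n_out + members * (n_in + n_out)
    deep_ensemble = members * n_in * n_out
    return batch_ensemble, deep_ensemble
```

Under these assumptions, each extra member adds only `n_in + n_out` parameters per layer rather than `n_in * n_out`, which is the source of the memory savings. It also means every member is a rescaling of the same shared matrix, which is consistent with the limited diversity reported above.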
