Ex uno plures: Splitting One Model into an Ensemble of Subnetworks
This work targets practitioners who need better ensemble accuracy and uncertainty calibration in deep learning, offering a more efficient alternative to compute-intensive methods such as deep ensembles.
The paper tackles the performance gap between Monte Carlo (MC) dropout and deep ensembles by splitting a single model into an ensemble of independently trained subnetworks. The resulting ensemble matches deep ensembles in accuracy and uncertainty estimates at a computational cost similar to MC dropout, as demonstrated on CIFAR10/100, CUB200, and Tiny-Imagenet.
Monte Carlo (MC) dropout is a simple and efficient ensembling method that can improve the accuracy and confidence calibration of high-capacity deep neural network models. However, MC dropout is not as effective as more compute-intensive methods such as deep ensembles. This performance gap can be attributed to the relatively poor quality of individual models in the MC dropout ensemble and their lack of diversity. These issues can in turn be traced back to the coupled training and substantial parameter sharing of the dropout models. Motivated by this perspective, we propose a strategy to compute an ensemble of subnetworks, each corresponding to a non-overlapping dropout mask computed via a pruning strategy and trained independently. We show that the proposed subnetwork ensembling method can perform as well as standard deep ensembles in both accuracy and uncertainty estimates, yet with a computational efficiency similar to MC dropout. Lastly, using several computer vision datasets like CIFAR10/100, CUB200, and Tiny-Imagenet, we experimentally demonstrate that subnetwork ensembling also consistently outperforms recently proposed approaches that efficiently ensemble neural networks.
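To make the idea of non-overlapping dropout masks concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a single hidden layer without biases, and it substitutes a random partition of the hidden units for the pruning strategy described above. The names `partition_masks`, `subnetwork_probs`, and `ensemble_probs` are hypothetical, and the weights are random placeholders standing in for subnetworks that would each be trained independently.

```python
import numpy as np

def partition_masks(num_units, num_members, rng):
    """Split hidden-unit indices into disjoint boolean masks.

    Assumption: a random partition stands in for the pruning-based
    mask computation described in the abstract.
    """
    perm = rng.permutation(num_units)
    masks = np.zeros((num_members, num_units), dtype=bool)
    for k, idx in enumerate(np.array_split(perm, num_members)):
        masks[k, idx] = True
    return masks

def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def subnetwork_probs(x, W1, W2, mask):
    # ReLU hidden layer, then keep only the units owned by this member.
    h = np.maximum(x @ W1, 0.0) * mask
    return softmax(h @ W2)

def ensemble_probs(x, W1, W2, masks):
    # Average the class probabilities predicted by each subnetwork.
    return np.mean([subnetwork_probs(x, W1, W2, m) for m in masks], axis=0)

rng = np.random.default_rng(0)
masks = partition_masks(num_units=12, num_members=3, rng=rng)

# Placeholder weights (assumption): in practice each subnetwork's
# parameters would come from its own independent training run.
W1 = rng.normal(size=(5, 12))
W2 = rng.normal(size=(12, 4))
x = rng.normal(size=(2, 5))

probs = ensemble_probs(x, W1, W2, masks)
```

Because the masks are disjoint, each member in this toy setting uses its own columns of `W1` and rows of `W2`, so no parameters are shared between subnetworks. That differs from MC dropout, where randomly sampled masks overlap and all sampled models share one jointly trained set of weights.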