LG | CV | ML | Oct 13, 2020

Training independent subnetworks for robust prediction

arXiv:2010.06610v2 | 236 citations
Originality: Highly original
AI Analysis

This paper addresses the high computational cost of robust prediction, offering machine learning practitioners a novel method that obtains the benefits of ensembling within a single forward pass.

The paper tackles the computational cost of ensembling neural networks for robustness by training multiple independent subnetworks within a single model, using a multi-input multi-output (MIMO) configuration. It reports significant improvements in negative log-likelihood, accuracy, and calibration error on CIFAR10, CIFAR100, and ImageNet without increasing compute.
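The three metrics named above can be sketched as follows. This is a minimal illustration that assumes the standard definitions (mean negative log-probability of the true class, top-1 accuracy, and expected calibration error over equal-width confidence bins). The function names and the default of 15 bins are illustrative choices, not taken from the paper.

```python
import numpy as np


def negative_log_likelihood(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())


def accuracy(probs, labels):
    """Fraction of examples whose highest-probability class is the label."""
    return float((probs.argmax(axis=1) == labels).mean())


def expected_calibration_error(probs, labels, n_bins=15):
    """Weighted average gap between confidence and accuracy per bin.

    Assumption: equal-width bins over (0, 1]; n_bins is a hypothetical default.
    """
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by fraction of examples in bin
    return float(ece)
```

Lower is better for negative log-likelihood and calibration error; higher is better for accuracy.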

Recent approaches to efficiently ensemble neural networks have shown that strong robustness and uncertainty performance can be achieved with a negligible gain in parameters over the original network. However, these methods still require multiple forward passes for prediction, leading to a significant computational cost. In this work, we show a surprising result: the benefits of using multiple predictions can be achieved "for free" under a single model's forward pass. In particular, we show that, using a multi-input multi-output (MIMO) configuration, one can utilize a single model's capacity to train multiple subnetworks that independently learn the task at hand. By ensembling the predictions made by the subnetworks, we improve model robustness without increasing compute. We observe a significant improvement in negative log-likelihood, accuracy, and calibration error on CIFAR10, CIFAR100, ImageNet, and their out-of-distribution variants compared to previous methods.
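The multi-input multi-output idea in the abstract can be sketched roughly as below. This is a toy NumPy illustration, not the authors' implementation. It assumes that the M inputs are concatenated before a shared body, that the output layer is split into M heads, that each head is trained on an independently sampled example, and that at test time the same input is repeated M times and the M predictions are averaged. The two-layer network and all names (`init_params`, `mimo_forward`, `training_loss`, `predict`) are hypothetical.

```python
import numpy as np


def init_params(rng, m, d, hidden, classes):
    """Toy two-layer network: input widened to m*d, output widened to m*classes."""
    return {
        "w1": rng.normal(0.0, 0.1, (m * d, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.1, (hidden, m * classes)),
        "b2": np.zeros(m * classes),
    }


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mimo_forward(x, params):
    """x: (batch, m, d) -> logits: (batch, m, classes), in one forward pass."""
    batch, m, d = x.shape
    flat = x.reshape(batch, m * d)  # concatenate the m inputs
    h = np.maximum(flat @ params["w1"] + params["b1"], 0.0)  # shared ReLU body
    logits = h @ params["w2"] + params["b2"]
    return logits.reshape(batch, m, -1)  # split into m output heads


def training_loss(x, y, params):
    """Sum of per-head cross-entropies; head i is matched with input i.

    Assumes each row of x holds m independently sampled examples,
    with integer labels y of shape (batch, m).
    """
    probs = softmax(mimo_forward(x, params))
    picked = np.take_along_axis(probs, y[..., None], axis=-1)[..., 0]
    return float(-np.log(picked).sum(axis=1).mean())


def predict(x_single, params, m):
    """Test time: repeat each input m times and average the m head outputs."""
    x = np.repeat(x_single[:, None, :], m, axis=1)
    return softmax(mimo_forward(x, params)).mean(axis=1)
```

Under these assumptions, only the first and last layers grow with M, so the ensemble prediction costs roughly one forward pass of the base network.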

Code Implementations: 2 repos
