LG · Jun 16, 2021

A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness

arXiv:2106.09129v2 · 85 citations · Has Code
AI Analysis

This work addresses the problem of deploying efficient, reliable deep learning models in real-world settings where the data distribution shifts. Its approach is novel rather than incremental and delivers strong empirical gains.

The paper tackles the challenge of building Compact, Accurate, and Robust Deep neural networks (CARDs). It shows that lottery ticket-style pruning and quantization yield models whose test accuracy is similar to that of their larger counterparts and whose robustness matches or exceeds it, and it reports state-of-the-art results on CIFAR-10-C (96.8% standard, 92.75% robust) and CIFAR-100-C (80.6% standard, 71.3% robust).
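To make "lottery ticket-style pruning and quantization" concrete, here is a minimal NumPy sketch of one pruning round followed by weight binarization. It is an illustration under stated assumptions, not the paper's procedure: the function names (`magnitude_prune_mask`, `rewind`, `binarize`) are hypothetical, training is omitted, pruning is done on a single weight array, and the binarization rule (sign times the mean magnitude of surviving weights) is one common choice that the summary does not confirm.

```python
import numpy as np

def magnitude_prune_mask(trained_weights, sparsity):
    """Return a 0/1 mask that zeroes the `sparsity` fraction of smallest-magnitude weights."""
    flat = np.abs(trained_weights).ravel()
    n_prune = int(round(sparsity * flat.size))
    mask = np.ones(flat.size, dtype=trained_weights.dtype)
    if n_prune > 0:
        # Indices of the n_prune smallest magnitudes.
        prune_idx = np.argsort(flat, kind="stable")[:n_prune]
        mask[prune_idx] = 0
    return mask.reshape(trained_weights.shape)

def rewind(init_weights, mask):
    """Lottery-ticket step: reset surviving weights to their initial values before retraining."""
    return init_weights * mask

def binarize(weights, mask):
    """Assumed binarization: each surviving weight becomes sign(w) * mean |w| over survivors."""
    kept = np.abs(weights[mask > 0])
    scale = kept.mean() if kept.size else 0.0
    return np.sign(weights) * scale * mask

# Toy usage: a 2x2 "layer" pruned to 50% sparsity.
trained = np.array([[0.1, -2.0], [0.5, -0.05]])
initial = np.array([[0.3, -0.4], [0.2, 0.6]])
mask = magnitude_prune_mask(trained, sparsity=0.5)
ticket = rewind(initial, mask)        # sparse weights to retrain from
binary = binarize(trained, mask)      # binary-weight version of the pruned layer
```

In a full lottery ticket-style pipeline the train, prune, and rewind steps would repeat over several rounds; the sketch shows only the bookkeeping for one round.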

Successful adoption of deep learning (DL) in the wild requires models to be: (1) compact, (2) accurate, and (3) robust to distributional shifts. Unfortunately, efforts towards simultaneously meeting these requirements have mostly been unsuccessful. This raises an important question: Is the inability to create Compact, Accurate, and Robust Deep neural networks (CARDs) fundamental? To answer this question, we perform a large-scale analysis of popular model compression techniques which uncovers several intriguing patterns. Notably, in contrast to traditional pruning approaches (e.g., fine-tuning and gradual magnitude pruning), we find that "lottery ticket-style" approaches can surprisingly be used to produce CARDs, including binary-weight CARDs. Specifically, we are able to create extremely compact CARDs that, compared to their larger counterparts, have similar test accuracy and matching (or better) robustness -- simply by pruning and (optionally) quantizing. Leveraging the compactness of CARDs, we develop a simple domain-adaptive test-time ensembling approach (CARD-Decks) that uses a gating module to dynamically select appropriate CARDs from the CARD-Deck based on their spectral similarity with test samples. The proposed approach builds a "winning hand" of CARDs that establishes a new state-of-the-art (on RobustBench) on CIFAR-10-C accuracies (i.e., 96.8% standard and 92.75% robust) and CIFAR-100-C accuracies (80.6% standard and 71.3% robust) with better memory usage than non-compressed baselines (pretrained CARDs and CARD-Decks available at https://github.com/RobustBench/robustbench). Finally, we provide theoretical support for our empirical findings.
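The gating idea in the abstract, choosing a CARD by spectral similarity to the test sample, can be sketched as follows. This is a hedged illustration, not the authors' gating module: the abstract does not say how similarity is computed, so the sketch assumes cosine similarity between normalized log-amplitude Fourier spectra, one reference spectrum per CARD, and grayscale inputs. The names `log_amplitude_spectrum` and `select_card` are hypothetical.

```python
import numpy as np

def log_amplitude_spectrum(image):
    """Flattened, L2-normalized log-amplitude spectrum of a 2-D grayscale image."""
    spectrum = np.log1p(np.abs(np.fft.fft2(image)))
    vec = spectrum.ravel()
    return vec / (np.linalg.norm(vec) + 1e-12)

def select_card(image, reference_spectra):
    """Return the index of the CARD whose reference spectrum is most similar to the image.

    Assumption: reference_spectra[i] summarizes the data that CARD i was trained on,
    computed with log_amplitude_spectrum (for example, from a representative image).
    """
    query = log_amplitude_spectrum(image)
    # Vectors are unit-norm, so the dot product is the cosine similarity.
    sims = [float(query @ ref) for ref in reference_spectra]
    return int(np.argmax(sims))

# Toy usage: one smooth (low-frequency) reference and one checkerboard (high-frequency) reference.
n = 8
smooth = np.outer(np.sin(np.linspace(0.0, np.pi, n)), np.ones(n))
rows, cols = np.indices((n, n))
checker = np.where((rows + cols) % 2 == 0, 1.0, -1.0)
references = [log_amplitude_spectrum(smooth), log_amplitude_spectrum(checker)]

chosen_for_smooth = select_card(1.1 * smooth, references)
chosen_for_checker = select_card(0.9 * checker, references)
```

A deck-level predictor would then run the selected CARD, or average the predictions of the top few, on the test sample; that step depends on the trained models and is left out here.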

Code Implementations: 2 repos
