LG | Jul 28, 2022

CrAM: A Compression-Aware Minimizer

arXiv:2207.14200v4 | 11 citations | h-index: 41 | Has Code
Originality: Highly original
AI Analysis

This work addresses the need to deploy DNNs efficiently in practical settings by enabling one-shot compression. It is incremental in that it builds on existing compression methods, but it offers improved stability and performance.

The paper tackles the problem of compressing deep neural networks after training without significant accuracy loss. It proposes CrAM, a compression-aware minimizer that modifies the optimization step so that the resulting dense models remain stable under pruning. Experimental results show that CrAM-trained models can be pruned in one shot to 70-80% sparsity with almost no accuracy loss and to 90% sparsity with about 1% accuracy loss, which is competitive with gradual compression methods. The dense models can also be more accurate than standard SGD/Adam-based baselines.
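To make "pruned in one shot to a target sparsity" concrete, here is a minimal sketch of magnitude pruning applied in a single step. It is an illustration only, not code from the CrAM repository: the function name `one_shot_magnitude_prune`, the flat-list representation of the weights, and the toy values are all assumptions, and magnitude is assumed as the pruning criterion.

```python
def one_shot_magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of entries with the smallest magnitude.

    `weights` is a flat list of floats (hypothetical stand-in for a layer's
    parameters); a new list is returned and the input is left unchanged.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    num_pruned = round(sparsity * len(weights))
    # Indices sorted by ascending absolute value; the first num_pruned are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:num_pruned]:
        pruned[i] = 0.0
    return pruned


# Toy example: prune 10 weights to 80% sparsity in a single step.
weights = [0.9, -0.05, 0.3, -1.2, 0.01, 0.6, -0.02, 0.15, 2.0, -0.4]
sparse_weights = one_shot_magnitude_prune(weights, sparsity=0.8)
print(sparse_weights)
# [0.0, 0.0, 0.0, -1.2, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0]
```

In the setting summarized above, a step like this would be applied once to an already trained dense model, with no gradual pruning schedule; any fine-tuning or statistics recalibration after pruning is outside this sketch.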

Deep neural networks (DNNs) often have to be compressed, via pruning and/or quantization, before they can be deployed in practical settings. In this work we propose a new compression-aware minimizer dubbed CrAM that modifies the optimization step in a principled way, in order to produce models whose local loss behavior is stable under compression operations such as pruning. Thus, dense models trained via CrAM should be compressible post-training, in a single step, without significant accuracy loss. Experimental results on standard benchmarks, such as residual networks for ImageNet classification and BERT models for language modelling, show that CrAM produces dense models that can be more accurate than the standard SGD/Adam-based baselines, but which are stable under weight pruning: specifically, we can prune models in one-shot to 70-80% sparsity with almost no accuracy loss, and to 90% with reasonable ($\sim 1\%$) accuracy loss, which is competitive with gradual compression methods. Additionally, CrAM can produce sparse models which perform well for transfer learning, and it also works for semi-structured 2:4 pruning patterns supported by GPU hardware. The code for reproducing the results is available at https://github.com/IST-DASLab/CrAM .
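The semi-structured 2:4 pattern mentioned in the abstract requires that, in every group of four consecutive weights, at most two are nonzero, which gives 50% sparsity. The sketch below shows one simple way to build such a pattern by keeping the two largest-magnitude entries per group. The function name `prune_2_of_4`, the flat-row layout, and the magnitude criterion are assumptions made for illustration; they are not taken from the paper's code, and real GPU kernels fix the dimension along which groups are formed.

```python
def prune_2_of_4(weights):
    """Keep the 2 largest-magnitude entries in every consecutive group of 4.

    `weights` is a flat list of floats whose length is a multiple of 4
    (hypothetical stand-in for one row of a weight matrix).
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Positions of the two largest magnitudes within this group.
        by_magnitude = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)
        keep = set(by_magnitude[:2])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


row = [0.9, -0.05, 0.3, -1.2, 0.01, 0.6, -0.02, 0.15]
row_2_4 = prune_2_of_4(row)
print(row_2_4)
# [0.9, 0.0, 0.0, -1.2, 0.0, 0.6, 0.0, 0.15]
```

Unlike unstructured pruning to a global sparsity target, the constraint here is local to each group, so a small weight can survive if its three neighbours are smaller still (0.15 in the second group above).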

Code Implementations: 1 repo