LGJul 30, 2021

Model Preserving Compression for Neural Networks

Jerry Chee, Megan Renz, Anil Damle, Christopher De Sa

arXiv:2108.00065v28.416 citationsHas Code

Originality Incremental advance

AI Analysis

This addresses the need for efficient model deployment without compromising accuracy or robustness, offering an incremental improvement over existing compression methods.

The paper tackles the problem of compressing neural networks while preserving per-example decisions, network structure, and eliminating fine-tuning, achieving strong empirical performance across tasks like ImageNet.

After training complex deep learning models, a common task is to compress the model to reduce compute and storage demands. When compressing, it is desirable to preserve the original model's per-example decisions (e.g., to go beyond top-1 accuracy or preserve robustness), maintain the network's structure, automatically determine per-layer compression levels, and eliminate the need for fine tuning. No existing compression methods simultaneously satisfy these criteria $\unicode{x2014}$ we introduce a principled approach that does by leveraging interpolative decompositions. Our approach simultaneously selects and eliminates channels (analogously, neurons), then constructs an interpolation matrix that propagates a correction into the next layer, preserving the network's structure. Consequently, our method achieves good performance even without fine tuning and admits theoretical analysis. Our theoretical generalization bound for a one layer network lends itself naturally to a heuristic that allows our method to automatically choose per-layer sizes for deep networks. We demonstrate the efficacy of our approach with strong empirical performance on a variety of tasks, models, and datasets $\unicode{x2014}$ from simple one-hidden-layer networks to deep networks on ImageNet.

View on arXiv PDF Code

Similar