LG, CV, ML
Mar 24, 2019

MUSCO: Multi-Stage Compression of neural networks

arXiv:1903.09973v4
11 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of reducing model size and computational cost for deep learning practitioners, but it is incremental as it builds on existing low-rank tensor approximation techniques.

The paper tackles neural network compression by proposing MUSCO, a multi-stage iterative method that alternates low-rank factorization with smart rank selection and fine-tuning, improving compression rates while maintaining accuracy across various tasks.

Low-rank tensor approximation is very promising for the compression of deep neural networks. We propose a new, simple, and efficient iterative approach that alternates low-rank factorization with smart rank selection and fine-tuning. We demonstrate the efficiency of our method compared to non-iterative ones. Our approach improves the compression rate while maintaining accuracy on a variety of tasks.
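
The alternation described in the abstract (factorize, select a rank, fine-tune, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single 2-D weight matrix, substitutes a truncated SVD for the tensor decomposition, uses a hypothetical energy-threshold rule for rank selection, and leaves fine-tuning as a no-op placeholder. All function names and the `energy` and `stages` parameters are made up for illustration.

```python
import numpy as np


def select_rank(singular_values, energy=0.9):
    """Hypothetical rank rule (an assumption, not the paper's):
    the smallest rank that keeps `energy` of the squared spectrum."""
    cumulative = np.cumsum(singular_values ** 2)
    rank = int(np.searchsorted(cumulative, energy * cumulative[-1])) + 1
    return min(rank, len(singular_values))


def factorize(weight, rank):
    """Truncated SVD: weight is approximated by a @ b with inner size `rank`."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank, :]


def fine_tune(a, b):
    """Placeholder: a real pipeline would retrain the factors on data here."""
    return a, b


def multi_stage_compress(weight, stages=3, energy=0.9):
    """Alternate rank selection, factorization, and (placeholder) fine-tuning."""
    a, b = weight, np.eye(weight.shape[1])
    ranks = []
    for _ in range(stages):
        current = a @ b  # the layer as currently represented
        spectrum = np.linalg.svd(current, compute_uv=False)
        rank = select_rank(spectrum, energy)
        a, b = factorize(current, rank)
        a, b = fine_tune(a, b)
        ranks.append(rank)
    return a, b, ranks
```

Calling `multi_stage_compress(w)` on a NumPy matrix `w` returns the two factors and the rank chosen at each stage. Because `fine_tune` does nothing here, each stage only truncates further. In a real multi-stage pipeline, the fine-tuning step is what would recover accuracy between compression stages.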

Code Implementations: 3 repos
