LG · CV · ML · Jun 15, 2019

Scalable Model Compression by Entropy Penalized Reparameterization

arXiv:1906.06624v3 · 46 citations
Originality: Incremental advance
AI Analysis

The paper provides a scalable, general method for compressing neural networks, which is useful for deployment in resource-constrained environments. It is an incremental advance in that it builds on existing reparameterization and entropy-based techniques.

The authors address neural network weight compression by jointly optimizing classification accuracy and compressibility, using an entropy penalty on a latent representation of the parameters. They report state-of-the-art compression on the MNIST, CIFAR-10, and ImageNet benchmarks across six architectures.

We describe a simple and general neural network weight compression approach, in which the network parameters (weights and biases) are represented in a "latent" space, amounting to a reparameterization. This space is equipped with a learned probability model, which is used to impose an entropy penalty on the parameter representation during training, and to compress the representation using a simple arithmetic coder after training. Classification accuracy and model compressibility are maximized jointly, with the bitrate-accuracy trade-off specified by a hyperparameter. We evaluate the method on the MNIST, CIFAR-10 and ImageNet classification benchmarks using six distinct model architectures. Our results show that state-of-the-art model compression can be achieved in a scalable and general way without requiring complex procedures such as multi-stage training.
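The joint objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It makes three simplifying assumptions: the decoder is a single scalar scale rather than a learned transform, the probability model is the empirical histogram of the quantized latents rather than a learned model, and the ideal code length stands in for the output of an arithmetic coder. All function names and the `lam` parameter are hypothetical.

```python
import math
from collections import Counter


def quantize(latents):
    """Round continuous latents to integers so they can be entropy coded."""
    return [round(x) for x in latents]


def decode(symbols, scale):
    """Map integer latents back to weights (assumed scalar decoder)."""
    return [scale * s for s in symbols]


def code_length_bits(symbols):
    """Ideal code length in bits under the empirical symbol distribution.

    A symbol with probability p costs -log2(p) bits, so a symbol that
    occurs n times out of `total` contributes -n * log2(n / total) bits.
    """
    counts = Counter(symbols)
    total = len(symbols)
    return sum(-n * math.log2(n / total) for n in counts.values())


def penalized_loss(task_loss, symbols, lam):
    """Task loss plus lam times the estimated bit cost of the latents.

    `lam` plays the role of the trade-off hyperparameter: larger values
    favor a smaller model at the expense of accuracy.
    """
    return task_loss + lam * code_length_bits(symbols)


# Toy usage: two distinct symbols, each with probability 0.5, cost 1 bit each.
symbols = quantize([0.2, -0.1, 1.3, 0.9])
weights = decode(symbols, scale=0.5)
loss = penalized_loss(task_loss=1.0, symbols=symbols, lam=0.5)
```

In an actual training loop the rounding step is not differentiable, so a gradient-based method needs some workaround for it. The sketch only shows how the bit cost enters the loss.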

