CV · LG · May 22, 2020

Position-based Scaled Gradient for Model Quantization and Pruning

arXiv:2005.11035v4 · 7 citations
AI Analysis

This work addresses model compression for efficient deployment in resource-constrained environments, presenting an incremental improvement through a novel gradient scaling technique.

The authors address the problem of making neural network weights more amenable to compression. They propose a position-based scaled gradient (PSG) that acts as a regularizer and reduces the gap between full-precision and compressed models. Experiments on CIFAR-10/100 and ImageNet show that it is effective for both pruning and quantization, even at extremely low bit widths.

We propose the position-based scaled gradient (PSG), which scales the gradient depending on the position of a weight vector to make it more compression-friendly. First, we theoretically show that applying PSG to standard gradient descent (GD), which we call PSGD, is equivalent to GD in a warped weight space, a space obtained by warping the original weight space via an appropriately designed invertible function. Second, we empirically show that PSG, acting as a regularizer on a weight vector, is favorable for model compression domains such as quantization and pruning. PSG reduces the gap between the weight distributions of a full-precision model and its compressed counterpart. This enables versatile deployment of a model in either uncompressed or compressed mode, depending on the availability of resources. Experimental results on the CIFAR-10/100 and ImageNet datasets show the effectiveness of the proposed PSG in both pruning and quantization, even for extremely low bit widths. The code is released on GitHub.
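
The abstract does not state the scaling function, so the following is only a minimal sketch of the general idea of scaling a gradient by a weight's position. It assumes a uniform quantization grid and a scale equal to the distance from each weight to its nearest grid point plus a small constant; the names `nearest_grid_point`, `psg_scale`, `psgd_step`, `step`, and `eps` are hypothetical and not taken from the paper or its released code.

```python
import numpy as np


def nearest_grid_point(w, step):
    """Round each weight to the nearest point of a uniform grid with spacing `step`."""
    return np.round(w / step) * step


def psg_scale(w, step, eps=1e-3):
    """Assumed position-based scale: distance to the nearest grid point plus `eps`.

    The scale is small for weights that already sit near a grid point and
    larger for weights far from one. This form is an illustrative assumption.
    """
    return np.abs(w - nearest_grid_point(w, step)) + eps


def psgd_step(w, grad, lr, step, eps=1e-3):
    """One gradient-descent step with the gradient scaled element-wise by position."""
    return w - lr * psg_scale(w, step, eps) * grad


if __name__ == "__main__":
    # Toy example: one weight on the grid (0.5) and one off the grid (0.7).
    w = np.array([0.5, 0.7])
    grad = np.array([1.0, 1.0])
    print("scale:", psg_scale(w, step=0.5))
    print("updated weights:", psgd_step(w, grad, lr=0.1, step=0.5))
```

Under this assumed scale, a weight that already lies on a grid point barely moves for the same raw gradient, while an off-grid weight takes a larger step. That is one way a position-dependent scale could act as a regularizer that keeps weights near compression-friendly values.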

Code Implementations: 1 repo
