LG · ML · Oct 3, 2018

Relaxed Quantization for Discretized Neural Networks

arXiv:1810.01875v1 · 144 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of deploying large neural networks on resource-constrained devices and represents an incremental improvement in quantization techniques.

The paper tackles neural network quantization for deployment on resource-constrained devices. It introduces a differentiable quantization procedure that transforms continuous distributions over the weights and activations into categorical distributions over a quantization grid, so that both the network and the grid itself can be optimized with gradient descent. The method is experimentally validated on MNIST, CIFAR-10, and ImageNet classification; the abstract reports no concrete performance numbers.

Neural network quantization has become an important research area due to its great impact on deployment of large models on resource-constrained devices. In order to train networks that can be effectively discretized without loss of performance, we introduce a differentiable quantization procedure. Differentiability can be achieved by transforming continuous distributions over the weights and activations of the network to categorical distributions over the quantization grid. These are subsequently relaxed to continuous surrogates that can allow for efficient gradient-based optimization. We further show that stochastic rounding can be seen as a special case of the proposed approach and that under this formulation the quantization grid itself can also be optimized with gradient descent. We experimentally validate the performance of our method on MNIST, CIFAR-10 and ImageNet classification.
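The two steps named in the abstract (continuous distribution to categorical distribution over the grid, then a continuous relaxation) can be illustrated with a minimal pure-Python sketch. The abstract does not specify the noise model or the relaxation, so the logistic noise, the uniform grid, the renormalisation over bins, and the Gumbel-softmax (Concrete) relaxation used here are all assumptions, and the function names are hypothetical rather than taken from the paper's code.

```python
import math
import random


def logistic_cdf(t):
    """CDF of the standard logistic distribution (tanh form avoids overflow)."""
    return 0.5 * (1.0 + math.tanh(t / 2.0))


def grid_probabilities(x, grid, sigma):
    """Categorical distribution over grid points for a noisy value x.

    Assumption: x is perturbed by logistic noise with scale sigma, and bin i
    spans [grid[i] - delta/2, grid[i] + delta/2). Mass falling outside the
    grid is dropped by renormalising (also an assumption).
    """
    delta = grid[1] - grid[0]  # assumes a uniformly spaced grid
    probs = [
        logistic_cdf((g + delta / 2 - x) / sigma)
        - logistic_cdf((g - delta / 2 - x) / sigma)
        for g in grid
    ]
    total = sum(probs)
    return [p / total for p in probs]


def relaxed_quantize(x, grid, sigma, temperature, rng=random):
    """Soft quantized value: a Gumbel-softmax sample over the grid, returned
    as a convex combination of grid points (assumed form of the relaxation)."""
    probs = grid_probabilities(x, grid, sigma)
    logits = []
    for p in probs:
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep log() finite
        gumbel = -math.log(-math.log(u))
        logits.append((math.log(max(p, 1e-30)) + gumbel) / temperature)
    m = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(v - m) for v in logits]
    norm = sum(exps)
    return sum((e / norm) * g for e, g in zip(exps, grid))
```

For example, `relaxed_quantize(0.1, [-1.0, -0.5, 0.0, 0.5, 1.0], sigma=0.1, temperature=0.5)` returns a value between -1 and 1 that concentrates near a single grid point as `temperature` shrinks. In an autodiff framework the same expression would be differentiable in `x`, `sigma`, and the grid values, which is the property the abstract relies on for gradient-based training.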

Code Implementations: 1 repo
