LG · CV · ML · Jul 1, 2019

Weight Normalization based Quantization for Deep Neural Network Compression

arXiv:1907.00593v1 · 15 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient model deployment on mobile and embedded devices by improving quantization methods, though it appears incremental because it builds on existing quantization techniques.

The paper tackles the problem of high quantization error in deep neural network compression caused by long-tail weight distributions, proposing weight normalization based quantization (WNQ) which achieves state-of-the-art performance on CIFAR-100 and ImageNet benchmarks.

With the development of deep neural networks, the size of network models becomes larger and larger. Model compression has become an urgent need for deploying these network models to mobile or embedded devices. Model quantization is a representative model compression technique. Although a lot of quantization methods have been proposed, many of them suffer from a high quantization error caused by a long-tail distribution of network weights. In this paper, we propose a novel quantization method, called weight normalization based quantization (WNQ), for model compression. WNQ adopts weight normalization to avoid the long-tail distribution of network weights and subsequently reduces the quantization error. Experiments on CIFAR-100 and ImageNet show that WNQ can outperform other baselines to achieve state-of-the-art performance.
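The intuition in the abstract can be illustrated with a small, self-contained sketch. It is not the paper's WNQ procedure: the synthetic weights, the helper names (`uniform_quantize`, `quantize_max_range`, `quantize_normalized`), the 4-bit setting, and the choice to clip standardized weights at ±3 are all assumptions made here for illustration. The sketch compares a uniform quantizer whose grid is stretched to cover a few large outliers against one applied after standardizing the weights to zero mean and unit variance.

```python
import random
import statistics


def uniform_quantize(values, lo, hi, bits):
    """Clip each value to [lo, hi] and round it to one of 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    step = (hi - lo) / levels
    out = []
    for v in values:
        clipped = min(max(v, lo), hi)
        out.append(lo + round((clipped - lo) / step) * step)
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def quantize_max_range(weights, bits):
    """Baseline: the quantization grid spans the largest absolute weight."""
    bound = max(abs(w) for w in weights)
    return uniform_quantize(weights, -bound, bound, bits)


def quantize_normalized(weights, bits, clip=3.0):
    """Assumed variant: standardize, quantize on [-clip, clip], map back to the original scale."""
    mean = statistics.fmean(weights)
    std = statistics.pstdev(weights)
    normalized = [(w - mean) / std for w in weights]
    quantized = uniform_quantize(normalized, -clip, clip, bits)
    return [q * std + mean for q in quantized]


# Synthetic layer: a narrow Gaussian bulk plus a few outliers standing in for the long tail.
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(10_000)]
weights += [1.0, -1.0, 0.9, -0.9, 0.8]

err_baseline = mse(weights, quantize_max_range(weights, bits=4))
err_normalized = mse(weights, quantize_normalized(weights, bits=4))
print(f"max-range MSE:  {err_baseline:.2e}")
print(f"normalized MSE: {err_normalized:.2e}")
```

In this toy setting the max-range grid spends most of its 16 levels on the sparsely populated tail, so nearly all weights collapse onto the two levels nearest zero. The normalized variant accepts a large error on the few clipped outliers in exchange for a much finer grid over the bulk of the distribution.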

