LG Dec 19, 2016

Quantization and Training of Low Bit-Width Convolutional Neural Networks for Object Detection

Penghang Yin, Shuai Zhang, Yingyong Qi, Jack Xin

arXiv:1612.06052v2 10.644 citations

Originality: Incremental advance

AI Analysis

This work addresses efficiency challenges in deploying CNNs for object detection, offering incremental improvements in quantization methods.

The authors reduce the memory and computational cost of convolutional neural networks for object detection with LBW-Net, a method that quantizes weights to low bit-widths. On PASCAL VOC, the 6-bit network is nearly lossless relative to its 32-bit floating-point counterpart and deploys more than 4x faster.

We present LBW-Net, an efficient optimization-based method for quantization and training of low bit-width convolutional neural networks (CNNs). Specifically, we quantize the weights to zero or powers of two by minimizing the Euclidean distance between full-precision weights and quantized weights during backpropagation. We characterize the combinatorial nature of the low bit-width quantization problem. For 2-bit (ternary) CNNs, the quantization of $N$ weights can be done by an exact formula in $O(N\log N)$ complexity. When the bit-width is three or above, we further propose a semi-analytical thresholding scheme with a single free parameter for quantization that is computationally inexpensive. The free parameter is further determined by network retraining and object detection tests. LBW-Net has several desirable advantages over full-precision CNNs, including considerable memory savings, energy efficiency, and faster deployment. Our experiments on the PASCAL VOC dataset show that, compared with its 32-bit floating-point counterpart, the performance of the 6-bit LBW-Net is nearly lossless in the object detection tasks, and can even do better in some real-world visual scenes, while empirically enjoying more than 4$\times$ faster deployment.
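To make the $O(N\log N)$ claim for the ternary case concrete, here is a minimal sketch. It assumes the problem is posed as minimizing $\|w - \delta q\|^2$ over a scale $\delta > 0$ and $q \in \{-1, 0, +1\}^N$; that formulation, the function name `ternary_quantize`, and the unconstrained real-valued $\delta$ (no power-of-two restriction) are assumptions for illustration, not details taken from the abstract. Under that formulation, keeping the $k$ largest magnitudes gives the optimal scale as their mean, and the best $k$ maximizes $(\sum_{i \le k} |w|_{(i)})^2 / k$, so a single sort dominates the cost.

```python
import numpy as np

def ternary_quantize(w):
    """Least-squares ternary quantization (assumed formulation):
    minimize ||w - delta * q||^2 over delta > 0 and q in {-1, 0, +1}^N.
    Returns (delta, q)."""
    w = np.asarray(w, dtype=np.float64)
    # Sort magnitudes in descending order: the O(N log N) step.
    mags = np.sort(np.abs(w).ravel())[::-1]
    if mags.size == 0 or mags[0] == 0.0:
        return 0.0, np.zeros_like(w)
    prefix = np.cumsum(mags)              # sum of the k largest magnitudes
    k = np.arange(1, mags.size + 1)
    best = int(np.argmax(prefix ** 2 / k))  # 0-based index of the best k
    delta = prefix[best] / (best + 1)     # mean of the kept magnitudes
    threshold = mags[best]                # smallest magnitude that is kept
    q = np.where(np.abs(w) >= threshold, np.sign(w), 0.0)
    return delta, q

delta, q = ternary_quantize([0.9, -1.1, 0.05, -0.1, 1.0])
print(delta, q)
```

For this toy input the three large weights are kept with their signs and the two small ones are set to zero. The higher bit-width scheme with its free thresholding parameter is not sketched here, since the abstract does not specify it.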
