LG · ML · Nov 27, 2018

Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware

arXiv:1811.10869v1 · 2 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for power-efficient neural network inference on mobile devices by optimizing quantization for FPGA accelerators, though it appears incremental as it builds on existing quantization methods.

The authors tackle the problem of implementing efficient non-uniform quantization for neural networks on reconfigurable hardware such as FPGAs. They introduce a hardware-friendly approach that uses a single-scale integer representation, and they report minimal accuracy degradation on the CIFAR-10 and CIFAR-100 datasets with ResNet-18 and VGG-like architectures.

Convolutional neural networks (CNNs) have become a popular choice for tasks such as computer vision, speech recognition, and natural language processing. Thanks to their large computational capability and throughput, GPUs are the most common platform for both training and inference, but they are not power efficient and therefore do not suit low-power systems such as mobile devices. Recent studies have shown that FPGAs can provide a good alternative to GPUs as CNN accelerators, owing to their reconfigurable nature, low power consumption, and low latency. For FPGA-based accelerators to outperform GPUs in inference, both the parameters of the network and the activations must be quantized. While most works use uniform quantizers for both parameters and activations, a uniform quantizer is not always optimal, and non-uniform quantizers need to be considered. In this work we introduce a custom hardware-friendly approach to implementing non-uniform quantizers. In addition, we use a single-scale integer representation of both parameters and activations, for both training and inference. The combined method yields a hardware-efficient non-uniform quantizer fit for real-time applications. We tested our method on the CIFAR-10 and CIFAR-100 image classification datasets with ResNet-18 and VGG-like architectures and observed little degradation in accuracy.
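To make the uniform versus non-uniform distinction concrete, here is a minimal Python sketch. It is not the authors' method: it assumes a generic quantile-based quantizer (one level per equal-probability bin) and synthetic Gaussian "weights", and every function name in it is hypothetical. It only illustrates why evenly spaced levels can be a poor fit for bell-shaped weight distributions.

```python
import bisect
import random


def uniform_levels(lo, hi, n_levels):
    """Evenly spaced representation levels on [lo, hi]."""
    step = (hi - lo) / (n_levels - 1)
    return [lo + i * step for i in range(n_levels)]


def quantile_levels(samples, n_levels):
    """Non-uniform levels: the mean of each equal-probability bin.

    Illustrative assumption only; the paper's quantizer may differ.
    """
    ordered = sorted(samples)
    n = len(ordered)
    levels = []
    for i in range(n_levels):
        chunk = ordered[i * n // n_levels:(i + 1) * n // n_levels]
        levels.append(sum(chunk) / len(chunk))
    return levels


def quantize(x, levels):
    """Map x to the nearest level; levels must be sorted ascending."""
    j = bisect.bisect_left(levels, x)
    if j == 0:
        return levels[0]
    if j == len(levels):
        return levels[-1]
    below, above = levels[j - 1], levels[j]
    return below if x - below <= above - x else above


def mse(samples, levels):
    """Mean squared quantization error over the samples."""
    return sum((x - quantize(x, levels)) ** 2 for x in samples) / len(samples)


rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(10000)]  # synthetic stand-in
n_levels = 8  # 3-bit quantization

u = uniform_levels(min(weights), max(weights), n_levels)
q = quantile_levels(weights, n_levels)

print("uniform MSE:    ", mse(weights, u))
print("non-uniform MSE:", mse(weights, q))
```

On this synthetic data the quantile-based levels cluster near zero, where most values lie, so the non-uniform error comes out lower than the uniform one. A hardware implementation would additionally need the levels to be cheap to index and store, which is the concern the abstract raises.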

