CV · Apr 11, 2016

Hardware-oriented Approximation of Convolutional Neural Networks

arXiv:1604.03168v3 · 324 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency challenges for deploying CNNs on resource-constrained hardware, representing an incremental improvement in model compression techniques.

The paper addresses the high computational cost of running Convolutional Neural Networks (CNNs) on mobile devices by introducing Ristretto, a hardware-oriented approximation framework. Ristretto condenses models such as CaffeNet and SqueezeNet to 8-bit fixed-point arithmetic within a maximum error tolerance of 1%.

High computational complexity hinders the widespread usage of Convolutional Neural Networks (CNNs), especially in mobile devices. Hardware accelerators are arguably the most promising approach for reducing both execution time and power consumption. One of the most important steps in accelerator development is hardware-oriented model approximation. In this paper we present Ristretto, a model approximation framework that analyzes a given CNN with respect to numerical resolution used in representing weights and outputs of convolutional and fully connected layers. Ristretto can condense models by using fixed point arithmetic and representation instead of floating point. Moreover, Ristretto fine-tunes the resulting fixed point network. Given a maximum error tolerance of 1%, Ristretto can successfully condense CaffeNet and SqueezeNet to 8-bit. The code for Ristretto is available.
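To make the fixed-point idea in the abstract concrete, here is a minimal sketch in plain Python of how a list of floating-point weights could be mapped to a signed 8-bit fixed-point format. It is not Ristretto's code: the function names `choose_fractional_bits` and `quantize` are hypothetical, and picking the fractional length from the largest magnitude in the tensor is an assumption made for illustration. The fine-tuning step the abstract mentions is not shown.

```python
import math


def choose_fractional_bits(values, bit_width=8):
    """Pick a fractional length so the largest magnitude fits in a signed word.

    Assumption for illustration: one format per tensor, chosen from max |value|.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return bit_width - 1
    # Bits needed left of the binary point (can be negative for tiny values).
    integer_bits = math.floor(math.log2(max_abs)) + 1
    # One bit is reserved for the sign.
    return bit_width - 1 - integer_bits


def quantize(values, bit_width=8, frac_bits=None):
    """Round values to signed fixed point and return them as floats again."""
    if frac_bits is None:
        frac_bits = choose_fractional_bits(values, bit_width)
    scale = 2.0 ** frac_bits
    q_min = -(2 ** (bit_width - 1))
    q_max = 2 ** (bit_width - 1) - 1
    quantized = []
    for v in values:
        q = round(v * scale)            # nearest integer code (ties to even)
        q = max(q_min, min(q_max, q))   # saturate instead of wrapping around
        quantized.append(q / scale)     # back to a real value for comparison
    return quantized, frac_bits


if __name__ == "__main__":
    weights = [0.5, -0.25, 0.1, 0.9]
    approx, frac_bits = quantize(weights)
    print("fractional bits:", frac_bits)
    for w, a in zip(weights, approx):
        print(f"{w:+.4f} -> {a:+.7f} (error {abs(w - a):.7f})")
```

As long as no value saturates, the rounding error per value is at most half of one step, that is 2^-(frac_bits + 1). Whether that error is acceptable for a given network is an empirical question, which is where an accuracy tolerance such as the 1% figure above comes in.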

Code Implementations: 1 repo
