CV | LG | Nov 20, 2015

Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications

arXiv:1511.06530v2 | 944 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of deploying complex CNNs on mobile devices for applications that require fast, low-power operation. It represents an incremental improvement in compression techniques.

The paper tackles the challenge of running deep convolutional neural networks on mobile devices by proposing a one-shot whole network compression scheme, achieving significant reductions in model size, runtime, and energy consumption with only a small loss in accuracy.

Although the latest high-end smartphone has powerful CPU and GPU, running deeper convolutional neural networks (CNNs) for complex tasks such as ImageNet classification on mobile devices is challenging. To deploy deep CNNs on mobile devices, we present a simple and effective scheme to compress the entire CNN, which we call one-shot whole network compression. The proposed scheme consists of three steps: (1) rank selection with variational Bayesian matrix factorization, (2) Tucker decomposition on kernel tensor, and (3) fine-tuning to recover accumulated loss of accuracy, and each step can be easily implemented using publicly available tools. We demonstrate the effectiveness of the proposed scheme by testing the performance of various compressed CNNs (AlexNet, VGGS, GoogLeNet, and VGG-16) on the smartphone. Significant reductions in model size, runtime, and energy consumption are obtained, at the cost of small loss in accuracy. In addition, we address the important implementation level issue on 1×1 convolution, which is a key operation of inception module of GoogLeNet as well as CNNs compressed by our proposed scheme.
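To make step (2) concrete, here is a minimal NumPy sketch of a Tucker-2 decomposition of a convolution kernel. It factorizes only the output-channel and input-channel modes and leaves the spatial modes intact. This is an illustration under assumptions, not the authors' implementation: it uses a truncated higher-order SVD (the abstract does not say which Tucker algorithm is used), the ranks are passed in by hand rather than chosen by variational Bayesian matrix factorization, and the function names `tucker2_hosvd`, `reconstruct`, and `param_counts` are hypothetical.

```python
import numpy as np


def tucker2_hosvd(kernel, rank_out, rank_in):
    """Truncated-HOSVD Tucker-2 of a kernel shaped (T, S, h, w).

    T = output channels, S = input channels, (h, w) = spatial size.
    Ranks are assumed to be given; the paper selects them with VBMF.
    """
    T, S, h, w = kernel.shape
    # Mode-0 unfolding (T, S*h*w): its leading left singular vectors
    # form the output-channel factor.
    U_out, _, _ = np.linalg.svd(kernel.reshape(T, -1), full_matrices=False)
    U_out = U_out[:, :rank_out]
    # Mode-1 unfolding (S, T*h*w): same idea for the input channels.
    unfold_in = np.moveaxis(kernel, 1, 0).reshape(S, -1)
    U_in, _, _ = np.linalg.svd(unfold_in, full_matrices=False)
    U_in = U_in[:, :rank_in]
    # Core tensor: project the kernel onto both factor subspaces.
    core = np.einsum("tshw,tr,sq->rqhw", kernel, U_out, U_in)
    return core, U_out, U_in


def reconstruct(core, U_out, U_in):
    """Rebuild the (approximate) full kernel from the Tucker-2 factors."""
    return np.einsum("rqhw,tr,sq->tshw", core, U_out, U_in)


def param_counts(T, S, h, w, rank_out, rank_in):
    """Weights in the original layer vs. the three factor layers."""
    original = T * S * h * w
    compressed = S * rank_in + rank_out * rank_in * h * w + T * rank_out
    return original, compressed


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    kernel = rng.standard_normal((64, 32, 3, 3))  # synthetic weights
    core, U_out, U_in = tucker2_hosvd(kernel, rank_out=16, rank_in=8)
    approx = reconstruct(core, U_out, U_in)
    rel_err = np.linalg.norm(kernel - approx) / np.linalg.norm(kernel)
    print("core shape:", core.shape)
    print("relative reconstruction error:", rel_err)
    print("params (original, compressed):", param_counts(64, 32, 3, 3, 16, 8))
```

Read as layers, `U_in` acts as a 1×1 convolution that reduces the input channels, `core` is a smaller h×w convolution, and `U_out` is a 1×1 convolution that restores the output channels. That is why the abstract stresses efficient 1×1 convolution for the compressed networks. The synthetic random kernel here has no low-rank structure, so its reconstruction error will be large. For trained weights the truncation error is what step (3), fine-tuning, is meant to recover.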

Code Implementations: 7 repos

Data from Papers with Code (CC-BY-SA-4.0)

