CV · Jul 19, 2017

Channel Pruning for Accelerating Very Deep Neural Networks

arXiv:1707.06168v2 · 2760 citations
AI Analysis

This work addresses the need for faster inference in deep learning models, particularly in resource-constrained applications. Its method is effective across a range of architectures, though the contribution is an incremental improvement over existing pruning techniques.

The paper tackles the problem of accelerating very deep neural networks by introducing a new channel pruning method, achieving a 5x speed-up with only a 0.3% error increase for VGG-16 and 2x speed-up with 1.4% and 1.0% accuracy loss for ResNet and Xception, respectively.

In this paper, we introduce a new channel pruning method to accelerate very deep convolutional neural networks. Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer, using LASSO-regression-based channel selection and least-squares reconstruction. We further generalize this algorithm to multi-layer and multi-branch cases. Our method reduces the accumulated error and enhances compatibility with various architectures. Our pruned VGG-16 achieves state-of-the-art results, with a 5x speed-up and only a 0.3% increase in error. More importantly, our method is able to accelerate modern networks such as ResNet and Xception, suffering only 1.4% and 1.0% accuracy loss, respectively, under a 2x speed-up, which is significant. Code has been made publicly available.
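The two-step idea in the abstract (select channels with LASSO, then refit the remaining weights by least squares) can be illustrated with a minimal single-layer sketch. This is not the authors' released code. The function names `lasso_ista` and `prune_layer`, the plain ISTA solver, the doubling schedule for the penalty, and the synthetic data are all assumptions made for illustration, and the sketch runs the two steps once rather than iterating.

```python
import numpy as np

def lasso_ista(A, y, lam, n_iter=500):
    """Minimise 1/(2M) * ||y - A b||^2 + lam * ||b||_1 with plain ISTA."""
    M = A.shape[0]
    L = np.linalg.norm(A, 2) ** 2 / M      # Lipschitz constant of the smooth part
    b = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ b - y) / M
        z = b - grad / L
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return b

def prune_layer(X, W, n_keep, lam=1e-3):
    """X: (N, c, k) sampled input patches, W: (n, c, k) filters.

    Returns the indices of the kept input channels and refitted weights.
    """
    N, c, k = X.shape
    n = W.shape[0]
    Y = np.einsum("sck,nck->sn", X, W)     # original layer output, (N, n)
    Z = np.einsum("sck,nck->csn", X, W)    # per-channel contributions, (c, N, n)
    A = Z.reshape(c, -1).T                 # one column per channel, (N*n, c)
    y = Y.ravel()

    # Step 1: channel selection. Raise the L1 penalty until at most
    # n_keep channel coefficients remain non-zero.
    beta = lasso_ista(A, y, lam)
    while np.count_nonzero(beta) > n_keep:
        lam *= 2.0
        beta = lasso_ista(A, y, lam)
    keep = np.flatnonzero(beta)

    # Step 2: reconstruction. Refit weights on the kept channels only,
    # by ordinary least squares against the original output.
    X_kept = X[:, keep, :].reshape(N, -1)
    W_flat, *_ = np.linalg.lstsq(X_kept, Y, rcond=None)
    W_new = W_flat.T.reshape(n, len(keep), k)
    return keep, W_new

# Synthetic demo: 8 input channels, of which only 3 carry large weights.
rng = np.random.default_rng(0)
N, c, k, n = 200, 8, 9, 4
X = rng.standard_normal((N, c, k))
scale = np.full(c, 0.05)
scale[[1, 4, 6]] = 1.0
W = rng.standard_normal((n, c, k)) * scale[None, :, None]

keep, W_new = prune_layer(X, W, n_keep=3)
Y = np.einsum("sck,nck->sn", X, W)
Y_pruned = np.einsum("sck,nck->sn", X[:, keep, :], W_new)
rel_err = np.linalg.norm(Y - Y_pruned) / np.linalg.norm(Y)
print("kept channels:", keep, "relative error:", rel_err)
```

In this toy setting the low-weight channels are the ones the L1 penalty drives to zero first, and the least-squares refit lets the surviving filters absorb part of what the dropped channels contributed. A real pipeline would sample `X` and `Y` from feature maps of the trained network rather than from random data.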

Code Implementations: 1 repo
