CV | Nov 14, 2022

Pruning Very Deep Neural Network Channels for Efficient Inference

arXiv:2211.08339v11.42 citations | h-index: 13

Originality: Incremental advance

AI Analysis

The paper addresses the need for efficient inference in deep learning models, particularly in resource-constrained applications. It is an incremental advance because it builds on existing pruning techniques.

The paper tackles the problem of accelerating very deep convolutional neural networks by introducing a new channel pruning method, achieving a 5x speed-up with only a 0.3% error increase for VGG-16 and 2x speed-up with 1.4% and 1.0% accuracy loss for ResNet and Xception, respectively.

In this paper, we introduce a new channel pruning method to accelerate very deep convolutional neural networks. Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer, using LASSO-regression-based channel selection followed by least-squares reconstruction. We further generalize this algorithm to multi-layer and multi-branch cases. Our method reduces the accumulated error and enhances compatibility with various architectures. Our pruned VGG-16 achieves state-of-the-art results, with a 5x speed-up and only a 0.3% increase in error. More importantly, our method is able to accelerate modern networks such as ResNet and Xception, suffering only 1.4% and 1.0% accuracy loss, respectively, under a 2x speed-up, which is significant. Our code has been made publicly available.
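
The two-step idea in the abstract (LASSO-based channel selection, then least-squares reconstruction) can be illustrated for a single layer with a minimal NumPy sketch. This is not the authors' released code. The function names `select_channels` and `reconstruct_weights`, the array layout, the penalty weight `lam`, and the use of ISTA (proximal gradient descent) as the LASSO solver are all assumptions made for illustration.

```python
import numpy as np

def select_channels(X, W, Y, lam, n_iter=500):
    """Step 1 (sketch): LASSO over per-channel scaling factors beta.

    Assumed (hypothetical) layout:
      X: (c, N, k) sampled input patches, one slab per input channel
      W: (c, k, n) original filter weights, split by input channel
      Y: (N, n)    original layer outputs at the sampled positions
    Minimises 0.5 * ||Y - sum_i beta_i X_i W_i||^2 + lam * ||beta||_1
    with ISTA, chosen here only for simplicity.
    """
    c = X.shape[0]
    # Column i holds the flattened contribution of input channel i.
    Z = np.stack([(X[i] @ W[i]).ravel() for i in range(c)], axis=1)
    y = Y.ravel()
    # Step size 1/L, where L is the squared spectral norm of Z.
    step = 1.0 / (np.linalg.norm(Z, 2) ** 2)
    beta = np.zeros(c)
    for _ in range(n_iter):
        beta = beta - step * (Z.T @ (Z @ beta - y))
        # Soft-thresholding: the proximal operator of the L1 penalty.
        beta = np.sign(beta) * np.maximum(np.abs(beta) - step * lam, 0.0)
    return beta

def reconstruct_weights(X, Y, keep):
    """Step 2 (sketch): refit weights on the kept channels by least squares."""
    Xk = np.concatenate([X[i] for i in keep], axis=1)  # (N, len(keep) * k)
    W_new, *_ = np.linalg.lstsq(Xk, Y, rcond=None)
    return W_new.reshape(len(keep), X.shape[2], Y.shape[1])
```

In this sketch, channels whose entry in `beta` is driven to zero are the ones pruned, for example `keep = np.flatnonzero(select_channels(X, W, Y, lam))`, after which `reconstruct_weights(X, Y, keep)` refits the remaining filters. One plausible way to reach a target channel count is to raise `lam` until few enough entries of `beta` remain nonzero; that schedule is an assumption here, not something stated in the abstract.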
