Pruning with Compensation: Efficient Channel Pruning for Deep Convolutional Neural Networks
This work addresses the computational and data inefficiency of channel pruning for researchers and practitioners in deep learning, offering a more efficient alternative to re-training-based methods.
The paper tackles the inefficiency of channel pruning in deep convolutional neural networks by proposing a pruning compensation method and a compensation-aware pruning algorithm, which together reduce processing time by 95% and data usage by 90% while maintaining competitive performance on benchmarks including CIFAR-10/100 and ImageNet.
Channel pruning is a promising technique for compressing the parameters of deep convolutional neural networks (DCNNs) and speeding up inference. This paper aims to address the long-standing inefficiency of channel pruning. Most channel pruning methods recover prediction accuracy by re-training the pruned model, starting either from the remaining parameters or from random initialization. This re-training process depends heavily on sufficient computational resources, sufficient training data, and human intervention (tuning the training strategy). In this paper, a highly efficient pruning method is proposed to significantly reduce the cost of pruning DCNNs. The main contributions of our method include: 1) pruning compensation, a fast and data-efficient substitute for re-training that minimizes the post-pruning reconstruction loss of features, 2) compensation-aware pruning (CaP), a novel pruning algorithm that removes redundant or less-weighted channels by minimizing the loss of information, and 3) binary structural search with a step constraint to minimize human intervention. On benchmarks including CIFAR-10/100 and ImageNet, our method shows pruning performance competitive with state-of-the-art re-training-based pruning methods and, more importantly, reduces the processing time by 95% and data usage by 90%.
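To make the idea of pruning compensation more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single linear layer (a 1x1 convolution viewed as a matrix product) and uses ordinary least squares to refit the weights of the kept channels so that the layer output is reconstructed as closely as possible without any re-training. The function names `compensate_after_pruning` and `pick_channel_to_prune`, the synthetic data, and the greedy selection rule are all illustrative assumptions; the abstract does not specify these details.

```python
import numpy as np


def compensate_after_pruning(X, W, keep):
    """Refit weights of the kept input channels by least squares.

    X    : (n_samples, c_in) input features of the layer
    W    : (c_in, c_out) original weights
    keep : indices of the input channels that remain after pruning

    Returns W_comp of shape (len(keep), c_out) that minimizes
    ||X @ W - X[:, keep] @ W_comp||_F (the feature reconstruction loss).
    """
    Y = X @ W                       # original (unpruned) output features
    X_kept = X[:, keep]             # features still available after pruning
    W_comp, *_ = np.linalg.lstsq(X_kept, Y, rcond=None)
    return W_comp


def pick_channel_to_prune(X, W):
    """Hypothetical greedy rule: prune the channel whose removal leaves
    the smallest reconstruction error AFTER compensation."""
    Y = X @ W
    c_in = X.shape[1]
    errors = []
    for j in range(c_in):
        keep = np.delete(np.arange(c_in), j)
        W_comp = compensate_after_pruning(X, W, keep)
        errors.append(np.linalg.norm(Y - X[:, keep] @ W_comp))
    return int(np.argmin(errors))


# Synthetic example (assumed data): channel 7 nearly duplicates channel 0.
rng = np.random.default_rng(0)
n, c_in, c_out = 256, 8, 4
X = rng.normal(size=(n, c_in))
X[:, 7] = X[:, 0] + 0.01 * rng.normal(size=n)
W = rng.normal(size=(c_in, c_out))
Y = X @ W

keep = np.arange(7)                 # drop channel 7
err_plain = np.linalg.norm(Y - X[:, keep] @ W[keep])   # prune only
W_comp = compensate_after_pruning(X, W, keep)
err_comp = np.linalg.norm(Y - X[:, keep] @ W_comp)     # prune + compensate
pick = pick_channel_to_prune(X, W)

print(f"reconstruction error without compensation: {err_plain:.4f}")
print(f"reconstruction error with compensation:    {err_comp:.4f}")
print(f"channel chosen by the greedy rule:         {pick}")
```

Because the uncompensated weights `W[keep]` are one feasible solution of the least-squares problem, the compensated error can never exceed the uncompensated one. In this toy setting the information carried by the pruned channel is almost entirely recoverable from its near-duplicate, which is the kind of redundancy that a compensation-aware selection rule would be expected to exploit.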