Convolutional Neural Network Pruning with Structural Redundancy Reduction
This work addresses network compression for efficient deep learning deployment, though it appears incremental because it builds on existing pruning approaches.
The paper tackles convolutional neural network pruning by identifying structural redundancy and pruning within the most redundant layer(s), rather than removing the least important filters across the whole network; its experiments show that this significantly outperforms previous state-of-the-art methods.
Convolutional neural network (CNN) pruning has become one of the most successful network compression approaches in recent years. Existing works on network pruning usually focus on removing the least important filters in the network to achieve compact architectures. In this study, we claim, both theoretically and empirically, that identifying structural redundancy plays a more essential role than finding unimportant filters. We first statistically model the network pruning problem from a redundancy reduction perspective and find that pruning in the layer(s) with the most structural redundancy outperforms pruning the least important filters across all layers. Based on this finding, we then propose a network pruning approach that identifies the structural redundancy of a CNN and prunes filters in the selected layer(s) with the most redundancy. Experiments on various benchmark network architectures and datasets show that our proposed approach significantly outperforms previous state-of-the-art methods.
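
The contrast drawn in the abstract, pruning inside the most redundant layer versus pruning the least important filters across all layers, can be illustrated with a toy sketch. The abstract does not say how structural redundancy is measured, so the score below (mean absolute pairwise cosine similarity between the filters of a layer) is only an assumed stand-in and not the paper's measure. The L1-norm importance score, the function names, and the toy weights are likewise hypothetical and chosen for illustration.

```python
import math
from itertools import combinations


def l1_norm(f):
    """Filter importance as the L1 norm of its weights (a common heuristic)."""
    return sum(abs(w) for w in f)


def cosine(a, b):
    """Cosine similarity of two flattened filters; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def layer_redundancy(filters):
    """ASSUMED proxy for structural redundancy (not the paper's definition):
    mean absolute pairwise cosine similarity between filters in one layer."""
    pairs = list(combinations(filters, 2))
    if not pairs:
        return 0.0
    return sum(abs(cosine(a, b)) for a, b in pairs) / len(pairs)


def prune_most_redundant_layer(layers, n):
    """Pick the layer with the highest redundancy score, then drop up to n of
    its lowest-L1 filters (always keeping at least one filter in the layer)."""
    target = max(layers, key=lambda name: layer_redundancy(layers[name]))
    filters = layers[target]
    order = sorted(range(len(filters)), key=lambda i: l1_norm(filters[i]))
    drop = set(order[:min(n, len(filters) - 1)])
    pruned = dict(layers)
    pruned[target] = [f for i, f in enumerate(filters) if i not in drop]
    return target, pruned


def prune_globally_least_important(layers, n):
    """Baseline: drop the n lowest-L1 filters regardless of their layer."""
    ranked = sorted(
        (l1_norm(f), name, i)
        for name, fs in layers.items()
        for i, f in enumerate(fs)
    )
    drop = {(name, i) for _, name, i in ranked[:n]}
    return {
        name: [f for i, f in enumerate(fs) if (name, i) not in drop]
        for name, fs in layers.items()
    }


# Toy weights: conv1 has mutually orthogonal filters, conv2 has near-duplicates.
layers = {
    "conv1": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.5]],
    "conv2": [[1.0, 1.0, 0.0], [2.0, 2.0, 0.1], [1.0, 0.9, 0.0], [3.0, 3.0, 0.0]],
}

target, pruned = prune_most_redundant_layer(layers, 1)
baseline = prune_globally_least_important(layers, 1)
```

On these toy weights the two strategies disagree. The redundancy-based rule removes a filter from conv2, whose filters nearly duplicate one another, while the global importance baseline removes the small-norm filter from conv1 even though that layer contains no redundancy under this proxy. This only illustrates the distinction the abstract draws; it is not evidence for the paper's accuracy claims.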