Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression
The method offers a more adaptable way to compress deep learning models, though the contribution is incremental because it builds on existing techniques.
The paper unifies filter pruning and low-rank decomposition under a single sparsity-regularization framework. This allows flexible compression that covers cases filter pruning alone cannot handle, such as the last convolutional layer of a ResBlock in ResNet, and it achieves competitive results on several benchmarks.
In this paper, we analyze two popular network compression techniques, i.e., filter pruning and low-rank decomposition, within a unified framework. By simply changing the way the sparsity regularization is enforced, either filter pruning or low-rank decomposition can be derived. This provides another flexible choice for network compression because the two techniques complement each other. For example, in popular network architectures with shortcut connections (e.g., ResNet), filter pruning cannot deal with the last convolutional layer in a ResBlock, while low-rank decomposition methods can. In addition, we propose to compress the whole network jointly instead of in a layer-wise manner. Our approach compares favorably to the state of the art on several benchmarks.
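To illustrate how changing the grouping of a sparsity regularizer can lead to different structured outcomes, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a small matrix `A` standing in for a learnable matrix attached to a layer, and applies a group L2,1 penalty with a group soft-thresholding step along either columns or rows. The function names, the example values, and the threshold are hypothetical. Which axis corresponds to pruning and which to decomposition depends on how such a matrix is attached to the layer, and the abstract does not specify this.

```python
import numpy as np

def group_l21_norm(matrix, axis):
    """Sum of the L2 norms of the groups.

    axis=0 treats each column as a group; axis=1 treats each row as a group.
    """
    return float(np.sqrt((matrix ** 2).sum(axis=axis)).sum())

def group_soft_threshold(matrix, threshold, axis):
    """Proximal step for the group L2,1 penalty.

    Shrinks every group's norm by `threshold` and zeroes groups whose
    norm falls below it, which yields structured (whole-group) sparsity.
    """
    norms = np.sqrt((matrix ** 2).sum(axis=axis, keepdims=True))
    scale = np.maximum(0.0, 1.0 - threshold / np.maximum(norms, 1e-12))
    return matrix * scale

# Hypothetical 3x3 matrix: the middle column and the middle row are weak.
A = np.array([[3.0, 0.1, 4.0],
              [0.2, 0.1, 0.1],
              [4.0, 0.1, 3.0]])

# Same penalty, different grouping: only the axis changes.
col_sparse = group_soft_threshold(A, threshold=1.0, axis=0)  # zeroes a whole column
row_sparse = group_soft_threshold(A, threshold=1.0, axis=1)  # zeroes a whole row

removed_columns = np.flatnonzero(~col_sparse.any(axis=0))
removed_rows = np.flatnonzero(~row_sparse.any(axis=1))
print("columns removed:", removed_columns)
print("rows removed:", removed_rows)
```

With column groups, an entire column of `A` vanishes, so one dimension of the matrix can be dropped outright. With row groups, the other dimension shrinks instead. The sketch only shows that the regularizer is identical in both cases and that the choice of groups determines which structure is removed.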