Receding Neuron Importances for Structured Pruning
This work addresses the challenge of efficiently compressing neural networks for deployment, though it appears incremental as it builds on existing BatchNorm-based pruning methods.
The paper tackles the problem of structured pruning in neural networks by introducing a novel regularization method that suppresses only low-importance neurons, leading to a polarized bimodal distribution of importances. It shows that networks trained this way can be pruned more extensively with less deterioration, significantly outperforming existing approaches for VGG-style networks under severe pruning regimes.
Structured pruning efficiently compresses networks by identifying and removing unimportant neurons. While this can be elegantly achieved by applying sparsity-inducing regularisation on BatchNorm parameters, an L1 penalty would shrink all scaling factors rather than just those of superfluous neurons. To tackle this issue, we introduce a simple BatchNorm variation with bounded scaling parameters, based on which we design a novel regularisation term that suppresses only neurons with low importance. Under our method, the weights of unnecessary neurons effectively recede, producing a polarised bimodal distribution of importances. We show that neural networks trained this way can be pruned to a larger extent and with less deterioration. We one-shot prune VGG and ResNet architectures at different ratios on CIFAR and ImageNet datasets. In the case of VGG-style networks, our method significantly outperforms existing approaches, particularly under a severe pruning regime.
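The contrast between an L1 penalty and a penalty that acts only on low-importance neurons can be illustrated with a toy sketch. The abstract does not give the exact form of the bounded BatchNorm or of the regularisation term, so everything below is an assumption made for illustration: scaling factors are clamped to [0, 1], the selective penalty is taken to be min(s, cutoff), the task loss is ignored entirely, and the names `l1_grad`, `receding_grad`, `shrink` and `one_shot_prune` are hypothetical.

```python
def l1_grad(s):
    # Gradient of |s| for a positive scale: the same pull toward zero
    # for every neuron, regardless of how important it is.
    return 1.0 if s > 0 else 0.0


def receding_grad(s, cutoff=0.5):
    # Gradient of the assumed penalty min(s, cutoff): non-zero only for
    # scales below the cutoff, so high-importance neurons are untouched.
    # This is an illustrative stand-in, not the paper's actual term.
    return 1.0 if 0 < s < cutoff else 0.0


def shrink(scales, grad_fn, lr=0.1, steps=5):
    # Plain gradient descent on the penalty alone (no task loss),
    # keeping each scale inside the assumed bounds [0, 1].
    out = list(scales)
    for _ in range(steps):
        out = [min(1.0, max(0.0, s - lr * grad_fn(s))) for s in out]
    return out


def one_shot_prune(scales, threshold=0.5):
    # Return the indices of neurons whose scale survives the threshold.
    return [i for i, s in enumerate(scales) if s >= threshold]


scales = [0.9, 0.8, 0.3, 0.2]           # two important, two superfluous neurons
after_l1 = shrink(scales, l1_grad)      # every scale is pulled down
after_receding = shrink(scales, receding_grad)  # only the small ones recede
kept = one_shot_prune(after_receding)   # indices of surviving neurons
```

In this toy setting the L1 update drags the two large scales below the pruning threshold along with the small ones, whereas the selective penalty leaves them in place and drives only the small scales to zero, giving the kind of two-cluster separation that makes a single pruning threshold easy to choose. In real training the task loss would counteract the penalty, so the actual dynamics would differ.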