Towards Generalized Entropic Sparsification for Convolutional Neural Networks
The paper provides a computationally scalable pruning method for reducing model size in CNNs. The contribution is incremental, as it builds on existing pruning techniques.
The paper tackles the problem of overparametrization in convolutional neural networks by introducing a layer-by-layer data-driven pruning method based on network entropy minimization, achieving sparsity levels of 55%-89% with accuracy losses of only 0.1%-0.5% on benchmarks like MNIST and CIFAR-10.
Convolutional neural networks (CNNs) are reported to be overparametrized. The search for an optimal (minimal) and sufficient architecture is an NP-hard problem, as the hyperparameter space of possible network configurations is vast. Here, we introduce a layer-by-layer, data-driven pruning method based on a computationally scalable entropic relaxation of the pruning problem. The sparse subnetwork is extracted from the pre-trained (full) CNN using network entropy minimization as a sparsity constraint. This allows deploying a numerically scalable algorithm with sublinear scaling cost. The method is validated on several benchmarks (architectures): (i) MNIST (LeNet), with sparsity of 55%-84% and accuracy loss of 0.1%-0.5%, and (ii) CIFAR-10 (VGG-16, ResNet18), with sparsity of 73%-89% and accuracy loss of 0.1%-0.5%.
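To make the link between entropy and sparsity concrete, the following minimal sketch treats the normalized per-channel L1 norms of one convolutional layer as a probability vector and compares its Shannon entropy before and after pruning. It is not the entropic relaxation proposed in the paper: plain magnitude-based channel pruning is swapped in as a stand-in, and the function names (`channel_distribution`, `shannon_entropy`, `prune_channels`, `sparsity`), the random kernel, and the choice of `keep=4` are all assumptions made purely for illustration.

```python
import numpy as np

def channel_norms(weights):
    """L1 norm of each output channel of a conv kernel (out, in, kH, kW)."""
    return np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)

def channel_distribution(weights):
    """Normalize channel norms into a probability vector (assumed proxy)."""
    norms = channel_norms(weights)
    return norms / norms.sum()

def shannon_entropy(p):
    """Shannon entropy in nats, with the convention 0 * log(0) = 0."""
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())

def prune_channels(weights, keep):
    """Zero all but the `keep` output channels with the largest L1 norm.

    Magnitude pruning is a stand-in here, not the paper's method.
    """
    order = np.argsort(channel_norms(weights))[::-1]  # largest norm first
    pruned = weights.copy()
    pruned[order[keep:]] = 0.0
    return pruned

def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return float((weights == 0).mean())

# Hypothetical layer: 16 output channels, 8 input channels, 3x3 kernels.
rng = np.random.default_rng(0)
w = rng.normal(size=(16, 8, 3, 3))
w_pruned = prune_channels(w, keep=4)

entropy_full = shannon_entropy(channel_distribution(w))
entropy_pruned = shannon_entropy(channel_distribution(w_pruned))
layer_sparsity = sparsity(w_pruned)
```

With 4 of 16 channels kept, the layer sparsity is 75% and the entropy of the channel distribution cannot exceed log(4), whereas the dense layer's entropy is close to log(16). This is the sense in which a lower entropy of such a distribution corresponds to a sparser layer; how the paper defines network entropy and minimizes it is not specified in the abstract.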