Advanced deep architecture pruning using single filter performance
This work addresses the need for more efficient neural networks in applications that require lower energy consumption and latency. It appears incremental, since it builds on existing pruning concepts and adds a new statistical-mechanics-inspired approach.
The paper tackles the problem of reducing computational complexity in neural networks by introducing a pruning method based on single filter performance. It achieves high pruning rates without accuracy loss on VGG-11 and EfficientNet-B0 architectures trained on CIFAR-100, and outperforms other techniques at the same pruning magnitude.
Pruning the parameters and structure of neural networks reduces computational complexity, energy consumption, and latency during inference. Recently, a comprehensive underlying mechanism for successful deep learning (DL) was presented, based on a method that quantitatively measures the single filter performance in each layer of a DL architecture. This statistical-mechanics-inspired viewpoint reveals the macroscopic behavior of the entire network from the microscopic performance of each filter and the filters' cooperative behavior. Herein, we demonstrate how this understanding paves the way to high quenched dilution of the convolutional layers of deep architectures, without affecting their overall accuracy, using applied filter cluster connections (AFCC). AFCC is exemplified on VGG-11 and EfficientNet-B0 architectures trained on CIFAR-100, and its high pruning outperforms other techniques at the same pruning magnitude. Additionally, the technique is extended to single nodal performance and high pruning of fully connected layers, suggesting a possible implementation that considerably reduces the complexity of over-parameterized AI tasks.
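The general idea of using per-filter performance to decide which connections to keep can be illustrated with a toy sketch. This is not the AFCC procedure itself, whose details the abstract does not give. It assumes a hypothetical per-filter score matrix (one row per filter, one column per label), an arbitrary threshold of 0.5, and a hypothetical matrix of filter-to-output connection weights. Each filter keeps only the connections to the labels in its cluster, and the resulting binary mask is fixed, in the spirit of a quenched dilution.

```python
# Toy sketch only: function names, the threshold, and all numbers below are
# illustrative assumptions, not the paper's AFCC implementation.

def filter_clusters(scores, threshold):
    """For each filter, collect the labels whose score exceeds the threshold."""
    return [{label for label, s in enumerate(row) if s > threshold}
            for row in scores]

def connection_mask(clusters, num_labels):
    """mask[f][l] is 1 if filter f keeps its connection to output label l."""
    return [[1 if label in cluster else 0 for label in range(num_labels)]
            for cluster in clusters]

def apply_mask(weights, mask):
    """Zero out every connection that the mask does not keep."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]

def pruned_fraction(mask):
    """Fraction of connections removed by the mask."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return 1 - kept / total

# Hypothetical scores: 3 filters, 4 labels.
scores = [
    [0.9, 0.8, 0.1, 0.0],
    [0.0, 0.1, 0.7, 0.2],
    [0.1, 0.0, 0.2, 0.95],
]
# Hypothetical filter-to-output connection weights with the same shape.
weights = [
    [0.5, -0.3, 0.2, 0.1],
    [0.4, 0.6, -0.7, 0.3],
    [-0.2, 0.1, 0.05, 0.8],
]

clusters = filter_clusters(scores, threshold=0.5)
mask = connection_mask(clusters, num_labels=4)
pruned_weights = apply_mask(weights, mask)
fraction = pruned_fraction(mask)
```

In this toy case the first filter keeps two connections and the other two keep one each, so 8 of the 12 connections are removed. In a real setting the scores would come from measuring each filter's performance on the trained network, and the mask would be held fixed during any further training.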