LG · CV · IV · Jun 7, 2022

Neural Network Compression via Effective Filter Analysis and Hierarchical Pruning

arXiv:2206.03596v12 citations · h-index: 49
Originality: Incremental advance
AI Analysis

This work addresses the efficiency and performance of deep networks deployed on low-end hardware, and represents an incremental improvement over current pruning techniques.

The study tackles network compression by proposing a method to estimate the maximum network redundancy and a hierarchical pruning algorithm guided by that estimate. At the same or similar compression ratios, it reports higher prediction accuracy than existing pruning methods.

Network compression is crucial for making deep networks more efficient, faster, and generalizable to low-end hardware. Current network compression methods have two open problems: first, there is no theoretical framework for estimating the maximum compression rate; second, some layers may be over-pruned, resulting in a significant drop in network performance. To solve these two problems, this study proposes a method based on gradient-matrix singularity analysis to estimate the maximum network redundancy. Guided by that maximum rate, a novel and efficient hierarchical network pruning algorithm is developed to maximally condense the neural network structure without sacrificing network performance. Extensive experiments demonstrate the efficacy of the new method for pruning several advanced convolutional neural network (CNN) architectures. Compared to existing pruning methods, the proposed pruning algorithm achieves state-of-the-art performance: at the same or similar compression ratio, it provides the highest network prediction accuracy.
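
The abstract does not spell out the pruning procedure, so the following is only a minimal sketch of the general idea it describes: prune toward a global maximum rate while preventing any single layer from being over-pruned. It is not the paper's algorithm. The function name `prune_with_layer_caps`, the per-filter importance scores, and the `layer_cap` parameter are all hypothetical, and `max_rate` is assumed to be supplied by some redundancy estimate rather than computed here.

```python
from typing import Dict, List


def prune_with_layer_caps(
    scores: Dict[str, List[float]],
    max_rate: float,
    layer_cap: float,
) -> Dict[str, List[int]]:
    """Return, per layer, the sorted indices of filters to remove.

    scores    : hypothetical per-filter importance scores (higher = keep)
    max_rate  : global fraction of filters that may be removed (assumed given)
    layer_cap : largest fraction that may be removed from any single layer
    """
    total = sum(len(s) for s in scores.values())
    budget = int(max_rate * total)  # global number of filters to remove

    # Per-layer removal limit; this is the guard against over-pruning a layer.
    limit = {name: int(layer_cap * len(s)) for name, s in scores.items()}

    # Rank every filter in the network, least important first.
    ranked = sorted(
        (score, name, idx)
        for name, s in scores.items()
        for idx, score in enumerate(s)
    )

    removed: Dict[str, List[int]] = {name: [] for name in scores}
    for _score, name, idx in ranked:
        if budget == 0:
            break
        # Skip filters from layers that already reached their cap.
        if len(removed[name]) < limit[name]:
            removed[name].append(idx)
            budget -= 1
    return {name: sorted(idxs) for name, idxs in removed.items()}


# Layer "a" holds the four lowest scores, but the cap spreads the removals.
example = prune_with_layer_caps(
    {"a": [0.1, 0.2, 0.3, 0.4], "b": [0.9, 0.8, 0.7, 0.6]},
    max_rate=0.5,
    layer_cap=0.5,
)
```

Without the per-layer cap, a purely global ranking would remove every filter in layer `a` in this example, which is the over-pruning failure mode the abstract mentions.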

