CV LG Aug 2, 2021

Group Fisher Pruning for Practical Network Compression

Liyang Liu, Shilong Zhang, Zhanghui Kuang, Aojun Zhou, Jing-Hao Xue, Xinjiang Wang, Yimin Chen, Wenming Yang, Qingmin Liao, Wayne Zhang

arXiv:2108.00708v130.7210 citations h-index: 43 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the practical challenge of compressing sophisticated networks for faster inference in applications like image classification and object detection, representing an incremental improvement over prior methods.

The paper tackles the pruning of network structures with coupled channels, such as residual connections and group convolutions. It proposes a general channel pruning approach that groups coupled layers automatically and uses Fisher information to score channel importance, and it reports faster inference without sacrificing accuracy.
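To make the idea of coupled channels concrete, here is a minimal sketch of one way layers could be grouped, assuming the only coupling comes from element-wise additions such as residual connections. The function name `group_coupled_layers`, the `add_nodes` input format, and the layer names are hypothetical, and the union-find approach is a stand-in for illustration, not the paper's layer grouping algorithm.

```python
def group_coupled_layers(layers, add_nodes):
    """Group layers whose output channels must be pruned together.

    layers: list of layer names.
    add_nodes: list of tuples; each tuple names layers whose outputs are
        summed element-wise (assumed to be the only source of coupling).
    """
    parent = {name: name for name in layers}

    def find(x):
        # Follow parent links to the group representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Layers feeding the same addition share one channel index space.
    for inputs in add_nodes:
        for other in inputs[1:]:
            union(inputs[0], other)

    groups = {}
    for name in layers:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Hypothetical two-block residual network.
example_layers = ["stem", "block1.conv1", "block1.conv2",
                  "block2.conv1", "block2.conv2"]
example_adds = [("stem", "block1.conv2"), ("block1.conv2", "block2.conv2")]
example_groups = group_coupled_layers(example_layers, example_adds)
```

In this toy network, `stem`, `block1.conv2`, and `block2.conv2` end up in one group because they write into the same residual stream, while each block's first convolution can be pruned on its own.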

Network compression has been widely studied because it reduces the memory and computation cost of inference. However, previous methods seldom deal with complicated structures like residual connections, group/depth-wise convolutions, and feature pyramid networks, where channels of multiple layers are coupled and need to be pruned simultaneously. In this paper, we present a general channel pruning approach that can be applied to various complicated structures. In particular, we propose a layer grouping algorithm to find coupled channels automatically. We then derive a unified metric based on Fisher information to evaluate the importance of a single channel and of coupled channels. Moreover, we find that inference speedup on GPUs correlates more strongly with memory reduction than with FLOPs reduction, and thus we use the memory reduction of each channel to normalize its importance. Our method can be used to prune any structure, including those with coupled channels. We conduct extensive experiments on various backbones, including the classic ResNet and ResNeXt, the mobile-friendly MobileNetV2, and the NAS-based RegNet, on both image classification and object detection, the latter of which is under-explored. Experimental results validate that our method can effectively prune sophisticated networks, boosting inference speed without sacrificing accuracy.
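The scoring step described above can be illustrated with a small sketch. It assumes the Fisher-based importance is approximated by the mean squared gradient of the loss with respect to a per-channel mask, with gradients of coupled layers summed before squaring because they share one mask. The abstract does not give the exact formula or scaling, so treat this as an assumption. The function name `group_fisher_importance` and the input layout are hypothetical, and the gradients are hand-written numbers rather than values from a real network.

```python
def group_fisher_importance(mask_grads, memory_saved):
    """Score each channel of one coupled group (lower = prune first).

    mask_grads: mask_grads[n][l][c] is the gradient of the loss on
        sample n with respect to the mask of channel c in layer l.
    memory_saved: memory_saved[c] is the memory freed by removing
        channel c from every layer in the group (any consistent unit).
    """
    num_samples = len(mask_grads)
    num_channels = len(memory_saved)
    scores = [0.0] * num_channels
    for per_layer in mask_grads:
        for c in range(num_channels):
            # Coupled channels share one mask, so their gradients are
            # summed across layers before squaring (assumed form).
            shared_grad = sum(layer[c] for layer in per_layer)
            scores[c] += shared_grad ** 2
    # Mean squared gradient, normalized by the memory each channel frees.
    return [s / num_samples / m for s, m in zip(scores, memory_saved)]


# Toy data: 2 samples, 2 coupled layers, 2 channels.
example_grads = [
    [[1.0, 0.5], [1.0, -0.5]],   # sample 0: layer 0, layer 1
    [[-2.0, 1.0], [0.0, 1.0]],   # sample 1: layer 0, layer 1
]
example_memory = [4.0, 1.0]
example_scores = group_fisher_importance(example_grads, example_memory)
prune_first = min(range(len(example_scores)), key=example_scores.__getitem__)
```

With these numbers the scores are `[1.0, 2.0]`. Channel 0 has the larger raw squared gradient, but removing it frees four times as much memory, so after normalization it becomes the cheaper channel to prune.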
