ML, CV, LG | Oct 10, 2018

A Closer Look at Structured Pruning for Neural Network Compression

arXiv:1810.04622v3 | 31 citations | Has Code
Originality: Incremental advance
AI Analysis

This work addresses efficient neural network compression for practitioners. It shows that structured pruning may be less effective than previously assumed and proposes a simple alternative: training smaller networks from scratch. The contribution is an incremental improvement.

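The paper investigates structured pruning for neural network compression and finds that reduced networks (smaller versions of the original network trained from scratch) consistently outperform pruned networks. Pruned architectures that are retrained from scratch are significantly more competitive and yield a scalable family of architectures. In hardware benchmarks, reduced networks have significantly faster inference than pruned networks.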

Structured pruning is a popular method for compressing a neural network: given a large trained network, one alternates between removing channel connections and fine-tuning, reducing the overall width of the network. However, the efficacy of structured pruning has largely evaded scrutiny. In this paper, we examine ResNets and DenseNets obtained through structured pruning-and-tuning and make two interesting observations: (i) reduced networks (smaller versions of the original network trained from scratch) consistently outperform pruned networks; (ii) if one takes the architecture of a pruned network and then trains it from scratch, it is significantly more competitive. Furthermore, these architectures are easy to approximate: we can prune once and obtain a family of new, scalable network architectures that can simply be trained from scratch. Finally, we compare the inference speed of reduced and pruned networks on hardware, and show that reduced networks are significantly faster. Code is available at https://github.com/BayesWatch/pytorch-prunes.
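
The prune-and-tune loop described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the code from the linked repository. The toy network (plain lists of weight rows standing in for channels), the L1-norm saliency criterion, and every function name below are assumptions chosen for illustration, and the `fine_tune` callback is a placeholder for a real training step.

```python
from typing import Callable, List

# Toy "network": a list of dense layers, each a matrix W[out_channel][in_channel].
# Real structured pruning removes convolutional channels; rows of a dense
# layer are used here only to keep the sketch short (an assumption).
Layer = List[List[float]]
Network = List[Layer]


def channel_l1_norms(layer: Layer) -> List[float]:
    """L1 norm of each output channel (row). Assumed saliency criterion."""
    return [sum(abs(w) for w in row) for row in layer]


def prune_one_channel(net: Network, layer_idx: int) -> int:
    """Remove the lowest-L1 output channel of net[layer_idx], in place.

    The matching input column of the next layer is removed too, so the
    layer shapes stay consistent. Returns the index that was removed.
    """
    norms = channel_l1_norms(net[layer_idx])
    victim = min(range(len(norms)), key=norms.__getitem__)
    del net[layer_idx][victim]
    if layer_idx + 1 < len(net):
        for row in net[layer_idx + 1]:
            del row[victim]
    return victim


def prune_and_tune(
    net: Network,
    layer_idx: int,
    n_remove: int,
    fine_tune: Callable[[Network], None],
) -> Network:
    """Alternate between removing one channel and fine-tuning."""
    for _ in range(n_remove):
        prune_one_channel(net, layer_idx)
        fine_tune(net)  # placeholder: a few training steps in practice
    return net


def layer_widths(net: Network) -> List[int]:
    """Number of output channels per layer, i.e. the pruned architecture."""
    return [len(layer) for layer in net]


if __name__ == "__main__":
    net = [
        [[0.5, -0.5], [0.01, 0.02], [1.0, 1.0]],  # layer 0: 3 out, 2 in
        [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],       # layer 1: 2 out, 3 in
    ]
    prune_and_tune(net, layer_idx=0, n_remove=1, fine_tune=lambda n: None)
    print(layer_widths(net))  # [2, 2]
```

The list returned by `layer_widths` is the part that observation (ii) reuses: one keeps only the per-layer widths of the pruned network, discards the weights, and trains a fresh network of that shape from scratch.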

Code Implementations: 2 repos
