LG CV NE ML Sep 18, 2020

Pruning Neural Networks at Initialization: Why are We Missing the Mark?

Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin

arXiv:2009.08576v2 33.9 265 citations

Originality Synthesis-oriented

AI Analysis

This work highlights fundamental challenges in pruning neural networks before training. For researchers in model compression and efficiency, it is an incremental step.

The paper assesses methods for pruning neural networks at initialization and finds that they reach lower accuracy than magnitude pruning after training. It also shows that randomly shuffling which weights are pruned within each layer, or sampling new initial values, preserves or improves accuracy, which points to problems with the pruning heuristics, with pruning at initialization itself, or with both.

Recent work has explored the possibility of pruning neural networks at initialization. We assess proposals for doing so: SNIP (Lee et al., 2019), GraSP (Wang et al., 2020), SynFlow (Tanaka et al., 2020), and magnitude pruning. Although these methods surpass the trivial baseline of random pruning, they remain below the accuracy of magnitude pruning after training, and we endeavor to understand why. We show that, unlike pruning after training, randomly shuffling the weights these methods prune within each layer or sampling new initial values preserves or improves accuracy. As such, the per-weight pruning decisions made by these methods can be replaced by a per-layer choice of the fraction of weights to prune. This property suggests broader challenges with the underlying pruning heuristics, the desire to prune at initialization, or both.
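
The layerwise shuffling ablation described in the abstract can be sketched roughly as follows. This is a minimal illustration in NumPy, not the authors' code: the layer names, shapes, and per-layer sparsities are hypothetical, and a simple per-layer magnitude criterion stands in for whichever method (SNIP, GraSP, SynFlow, or magnitude pruning) produced the original mask. The point it shows is that shuffling a mask within a layer changes which weights are pruned while keeping the fraction pruned in that layer fixed.

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Return a boolean mask that prunes the `sparsity` fraction of
    smallest-magnitude entries (True = keep, False = prune).
    Stand-in for any pruning-at-initialization heuristic."""
    flat = np.abs(weights).ravel()
    n_prune = int(round(sparsity * flat.size))
    mask = np.ones(flat.size, dtype=bool)
    if n_prune > 0:
        prune_idx = np.argsort(flat, kind="stable")[:n_prune]
        mask[prune_idx] = False
    return mask.reshape(weights.shape)

def shuffle_mask_within_layer(mask, rng):
    """Randomly permute mask entries inside one layer. The number of
    kept weights is unchanged; only their positions move."""
    flat = mask.ravel().copy()  # copy so the original mask is untouched
    rng.shuffle(flat)
    return flat.reshape(mask.shape)

rng = np.random.default_rng(0)

# Hypothetical two-layer network at initialization.
layers = {
    "fc1": rng.normal(size=(8, 16)),
    "fc2": rng.normal(size=(16, 4)),
}
# Hypothetical per-layer fractions of weights to prune.
sparsities = {"fc1": 0.75, "fc2": 0.5}

masks = {name: magnitude_mask(w, sparsities[name]) for name, w in layers.items()}
shuffled = {name: shuffle_mask_within_layer(m, rng) for name, m in masks.items()}

for name in layers:
    kept_before = int(masks[name].sum())
    kept_after = int(shuffled[name].sum())
    print(f"{name}: kept {kept_before} weights before shuffle, {kept_after} after")
```

In the experiment the abstract describes, a network would then be trained with the shuffled masks and its accuracy compared against training with the original masks. The sketch covers only mask construction and shuffling, not training.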
