Rare Gems: Finding Lottery Tickets at Initialization
This work addresses a key bottleneck in neural network pruning for researchers and practitioners: it enables faster and more effective sparse training without time-consuming re-training cycles.
The paper tackles the problem of efficiently finding trainable sparse subnetworks (lottery tickets) at initialization; previous algorithms for this task failed to outperform simple baselines. The proposed Gem-Miner method finds tickets that train to accuracy competitive with or better than Iterative Magnitude Pruning (IMP), and does so up to 19 times faster.
Large neural networks can be pruned to a small fraction of their original size, with little loss in accuracy, by following a time-consuming "train, prune, re-train" approach. Frankle & Carbin conjecture that we can avoid this by training "lottery tickets", i.e., special sparse subnetworks found at initialization that can be trained to high accuracy. However, a subsequent line of work by Frankle et al. and Su et al. presents concrete evidence that current algorithms for finding trainable networks at initialization fail simple baseline comparisons, e.g., against training random sparse subnetworks. Finding lottery tickets that train to better accuracy than simple baselines remains an open problem. In this work, we resolve this open problem by proposing Gem-Miner, which finds lottery tickets at initialization that beat current baselines. Gem-Miner finds lottery tickets trainable to accuracy competitive with or better than Iterative Magnitude Pruning (IMP), and does so up to $19\times$ faster.
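To make the two reference points in the abstract concrete, the following minimal sketch contrasts a single magnitude-pruning step (the "prune" part of "train, prune, re-train") with the random sparse subnetwork baseline at the same sparsity. It is not Gem-Miner and not a full IMP loop. The function names (`magnitude_mask`, `random_mask`, `apply_mask`), the flat weight list, and the toy values are all assumptions made for illustration.

```python
import random


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that keeps the largest-magnitude weights.

    `sparsity` is the fraction of weights to remove (0.0 to 1.0).
    """
    n_prune = int(round(sparsity * len(weights)))
    # Indices sorted by ascending magnitude; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def random_mask(n, sparsity, seed=0):
    """Random baseline: prune the same number of weights uniformly at random."""
    n_prune = int(round(sparsity * n))
    rng = random.Random(seed)
    pruned = set(rng.sample(range(n), n_prune))
    return [0 if i in pruned else 1 for i in range(n)]


def apply_mask(weights, mask):
    """Zero out pruned weights; surviving weights keep their values."""
    return [w * m for w, m in zip(weights, mask)]


# Toy weight vector (hypothetical values, flattened for simplicity).
weights = [0.5, -0.1, 0.03, -0.9, 0.2, 0.0, -0.4, 0.7]

mag = magnitude_mask(weights, sparsity=0.5)
rnd = random_mask(len(weights), sparsity=0.5)

sparse_by_magnitude = apply_mask(weights, mag)
sparse_at_random = apply_mask(weights, rnd)
```

Both masks keep the same number of weights, so any accuracy gap after training the two sparse networks comes from which weights survive rather than how many. This is the kind of baseline comparison the abstract refers to.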