The rise of the lottery heroes: why zero-shot pruning is hard
This work addresses a bottleneck in applying pruning techniques for deep learning, potentially benefiting practitioners by making model optimization more accessible, though it appears incremental in nature.
The paper tackles the problem of efficiently identifying trainable sub-networks in deep neural networks during training, proposing an approach that reduces computational effort while exploring a trade-off between accuracy and training complexity.
Recent advances in deep learning optimization have shown that only a subset of a model's parameters is necessary to train it successfully. Such a discovery potentially has broad impact, from theory to applications; however, finding these trainable sub-networks is known to be a costly process. This inhibits practical applications: can the learned sub-graph structures in deep learning models be found at training time? In this work we explore this possibility, observing and explaining why common approaches typically fail in the extreme scenarios of interest, and proposing an approach that potentially enables training with reduced computational effort. Experiments on both challenging architectures and datasets suggest that such a computational gain is algorithmically accessible, and in particular that a trade-off emerges between the accuracy achieved and the training complexity deployed.