LGJun 21, 2022

Winning the Lottery Ahead of Time: Efficient Early Network Pruning

John Rachwan, Daniel Zügner, Bertrand Charpentier, Simon Geisler, Morgane Ayle, Stephan Günnemann

arXiv:2206.10451v118.134 citationsh-index: 23Has Code

Originality Incremental advance

AI Analysis

This addresses the problem of high computational costs and hardware requirements for pruning in deep learning, offering a more efficient solution, though it appears incremental as it builds on existing pruning paradigms.

The paper tackles the inefficiency and hardware limitations of existing neural network pruning methods by proposing EarlyCroP, which efficiently extracts state-of-the-art sparse models early in training, enabling training on commodity GPUs and outperforming baselines across multiple tasks and domains with accuracy comparable to dense training.

Pruning, the task of sparsifying deep neural networks, received increasing attention recently. Although state-of-the-art pruning methods extract highly sparse models, they neglect two main challenges: (1) the process of finding these sparse models is often very expensive; (2) unstructured pruning does not provide benefits in terms of GPU memory, training time, or carbon emissions. We propose Early Compression via Gradient Flow Preservation (EarlyCroP), which efficiently extracts state-of-the-art sparse models before or early in training addressing challenge (1), and can be applied in a structured manner addressing challenge (2). This enables us to train sparse networks on commodity GPUs whose dense versions would be too large, thereby saving costs and reducing hardware requirements. We empirically show that EarlyCroP outperforms a rich set of baselines for many tasks (incl. classification, regression) and domains (incl. computer vision, natural language processing, and reinforcment learning). EarlyCroP leads to accuracy comparable to dense training while outperforming pruning baselines.

View on arXiv PDF Code

Similar