Fast Track to Winning Tickets: Repowering One-Shot Pruning for Graph Neural Networks
This work addresses the computational bottleneck for applying GNNs to large-scale graphs, offering a faster alternative to existing pruning methods, though it is incremental as it builds on the Graph Lottery Hypothesis.
The paper tackles the computational inefficiency of identifying Graph Lottery Tickets (GLTs) in Graph Neural Networks (GNNs) by proposing a one-shot pruning and denoising framework, which achieves higher sparsity (1.32%-45.62% weight sparsity improvement, 7.49%-22.71% graph sparsity increase) and faster speeds (1.7-44x speedup, 95.3%-98.6% MAC savings) compared to iterative methods.
Graph Neural Networks (GNNs) demonstrate superior performance in various graph learning tasks, yet their wider real-world application is hindered by the computational overhead when applied to large-scale graphs. To address the issue, the Graph Lottery Hypothesis (GLT) has been proposed, advocating the identification of subgraphs and subnetworks, \textit{i.e.}, winning tickets, without compromising performance. The effectiveness of current GLT methods largely stems from the use of iterative magnitude pruning (IMP), which offers higher stability and better performance than one-shot pruning. However, identifying GLTs is highly computationally expensive, due to the iterative pruning and retraining required by IMP. In this paper, we reevaluate the correlation between one-shot pruning and IMP: while one-shot tickets are suboptimal compared to IMP, they offer a \textit{fast track} to tickets with a stronger performance. We introduce a one-shot pruning and denoising framework to validate the efficacy of the \textit{fast track}. Compared to current IMP-based GLT methods, our framework achieves a double-win situation of graph lottery tickets with \textbf{higher sparsity} and \textbf{faster speeds}. Through extensive experiments across 4 backbones and 6 datasets, our method demonstrates $1.32\% - 45.62\%$ improvement in weight sparsity and a $7.49\% - 22.71\%$ increase in graph sparsity, along with a $1.7-44 \times$ speedup over IMP-based methods and $95.3\%-98.6\%$ MAC savings.