SpanGNN: Towards Memory-Efficient Graph Neural Networks via Spanning Subgraph Training
This work addresses memory constraints in GNN training on large graphs and offers a practical solution for researchers and practitioners, though the contribution is incremental because it builds on mini-batch approaches.
The paper tackles the memory inefficiency of full-graph GNN training by proposing SpanGNN, a method that trains on spanning subgraphs to reduce peak memory usage while maintaining accuracy, achieving competitive performance with lower memory consumption on standard datasets.
Graph Neural Networks (GNNs) have superior capability in learning from graph data. Full-graph GNN training generally achieves high accuracy; however, it suffers from large peak memory usage and runs into Out-of-Memory errors when handling large graphs. A popular solution to this memory problem is mini-batch GNN training, but mini-batch training increases the training variance and sacrifices model accuracy. In this paper, we propose SpanGNN, a new memory-efficient GNN training method based on spanning subgraphs. SpanGNN trains GNN models over a sequence of spanning subgraphs, which are constructed starting from an empty structure. To avoid excessive peak memory consumption, SpanGNN selects a set of edges from the original graph to incrementally update the spanning subgraph between epochs. To preserve model accuracy, we introduce two edge sampling strategies (variance-reduced and noise-reduced), which help SpanGNN select high-quality edges for GNN learning. We conduct experiments with SpanGNN on widely used datasets, demonstrating its advantages in model performance and low peak memory usage.
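The training schedule described in the abstract can be illustrated with a minimal sketch. It assumes that a spanning subgraph keeps every node and only a subset of the edges, and that new edges are added before each epoch's training step; the abstract does not fix that ordering. The names `grow_spanning_subgraph`, `train_over_spanning_subgraphs`, `edges_per_epoch`, and the `train_step` callback are hypothetical and do not come from the paper. The optional `weights` argument is only a placeholder for an edge-quality score: it is not the paper's variance-reduced or noise-reduced strategy, whose details the abstract does not give.

```python
import random


def grow_spanning_subgraph(all_edges, selected, num_new, rng, weights=None):
    """Add up to `num_new` not-yet-selected edge indices to the set `selected`.

    With `weights=None` edges are drawn uniformly. Otherwise edges are drawn
    without replacement with probability proportional to `weights` (all > 0),
    using Efraimidis-Spirakis keys. The weights are a stand-in (assumption)
    for an edge-quality score.
    """
    candidates = [i for i in range(len(all_edges)) if i not in selected]
    num_new = min(num_new, len(candidates))
    if weights is None:
        chosen = rng.sample(candidates, num_new)
    else:
        # Key u ** (1 / w): a larger weight makes a large key more likely.
        keys = [(rng.random() ** (1.0 / weights[i]), i) for i in candidates]
        chosen = [i for _, i in sorted(keys, reverse=True)[:num_new]]
    selected.update(chosen)
    return selected


def train_over_spanning_subgraphs(num_nodes, all_edges, num_epochs,
                                  edges_per_epoch, train_step, seed=0):
    """Run `train_step` once per epoch on a growing spanning subgraph.

    The subgraph starts with all nodes and no edges. Only the currently
    selected edges are handed to `train_step`, which is what bounds the
    per-epoch edge count in this sketch.
    """
    rng = random.Random(seed)
    selected = set()  # indices into all_edges; empty structure at the start
    edge_counts = []
    for _ in range(num_epochs):
        grow_spanning_subgraph(all_edges, selected, edges_per_epoch, rng)
        sub_edges = [all_edges[i] for i in sorted(selected)]
        train_step(num_nodes, sub_edges)  # hypothetical user-supplied callback
        edge_counts.append(len(sub_edges))
    return edge_counts
```

In practice `train_step` would wrap one forward and backward pass of a GNN over the subgraph's edge list. The sketch only shows the scheduling: the subgraph grows monotonically, and the number of edges seen in any epoch is capped by the number of epochs so far times `edges_per_epoch`.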