GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings
This work addresses the scalability problems that researchers and practitioners face when applying GNNs to large graphs, offering a solution that scales without trading away expressive power.
The authors tackle the problem of scaling graph neural networks (GNNs) to large graphs with GNNAutoScale, a framework that uses historical embeddings to prune computation. It achieves constant GPU memory usage without dropping any data, closely matches the performance of non-scaling GNNs, and reaches state-of-the-art results on large-scale graphs.
We present GNNAutoScale (GAS), a framework for scaling arbitrary message-passing GNNs to large graphs. GAS prunes entire sub-trees of the computation graph by utilizing historical embeddings from prior training iterations, leading to constant GPU memory consumption with respect to input node size without dropping any data. While existing solutions weaken the expressive power of message passing due to sub-sampling of edges or non-trainable propagations, our approach is provably able to maintain the expressive power of the original GNN. We achieve this by providing approximation error bounds of historical embeddings and showing how to tighten them in practice. Empirically, we show that the practical realization of our framework, PyGAS, an easy-to-use extension for PyTorch Geometric, is both fast and memory-efficient, learns expressive node representations, closely resembles the performance of its non-scaling counterparts, and reaches state-of-the-art performance on large-scale graphs.
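The idea of historical embeddings can be illustrated with a minimal pure-Python sketch. It is not the PyGAS API: the toy graph, the scalar features, the fixed weights, and the names `layer`, `full_batch`, and `gas_step` are all assumptions made for illustration. In each mini-batch, only in-batch nodes are computed; embeddings of out-of-batch neighbors are pulled from a history table instead of being recomputed, and freshly computed embeddings are pushed back into it.

```python
# Toy graph as an adjacency list with scalar node features (hypothetical data).
neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
features = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}


def layer(h_self, h_neighbors, w_self=0.5, w_neigh=0.5):
    """One mean-aggregation message-passing step with fixed toy weights."""
    mean = sum(h_neighbors) / len(h_neighbors)
    return w_self * h_self + w_neigh * mean


def full_batch():
    """Reference two-layer forward pass over the whole graph."""
    h1 = {v: layer(features[v], [features[u] for u in neighbors[v]])
          for v in neighbors}
    h2 = {v: layer(h1[v], [h1[u] for u in neighbors[v]]) for v in neighbors}
    return h2


def gas_step(batch, history):
    """Two-layer forward pass for the nodes in `batch` only.

    Layer-1 embeddings of out-of-batch neighbors are pulled from `history`
    rather than recomputed, which cuts off their computation sub-trees.
    """
    in_batch = set(batch)
    # Layer 1 is computed only for in-batch nodes; it needs just the raw
    # input features of their direct neighbors.
    h1 = {v: layer(features[v], [features[u] for u in neighbors[v]])
          for v in batch}
    # Push: store the fresh layer-1 embeddings for later mini-batches.
    history.update(h1)
    # Layer 2: pull possibly stale embeddings for out-of-batch neighbors.
    h2 = {}
    for v in batch:
        msgs = [h1[u] if u in in_batch else history[u] for u in neighbors[v]]
        h2[v] = layer(h1[v], msgs)
    return h2


history = {v: 0.0 for v in neighbors}   # histories start out empty (zeros)
first = gas_step([0, 1], history)       # neighbors 2 and 3 are still stale
gas_step([2, 3], history)               # pushes embeddings of nodes 2 and 3
second = gas_step([0, 1], history)      # histories are now up to date
exact = full_batch()
```

In this sketch the weights never change, so once every node has been pushed, the mini-batch output equals the full-batch output (`second[0] == exact[0]`), whereas `first[0]` differs because it was computed from stale histories. During real training the weights change between iterations, so histories are always somewhat stale; this staleness is the approximation error that the abstract says is bounded and tightened in practice.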