Graph Neural Networks on Graph Databases
This work targets the scaling problems that researchers and practitioners face when training GNNs. It proposes training directly on graph databases, though the contribution appears incremental because it builds on existing methods.
The paper tackles the challenge of training graph neural networks on large datasets by training them directly on a graph database: only minimal data is retrieved into memory, and sampling is delegated to the database's query engine. The authors report resource advantages for both single-machine and distributed training.
Training graph neural networks (GNNs) on large datasets has long been a challenge. Traditional approaches include efficiently representing the whole graph in memory, designing parameter-efficient and sampling-based models, and partitioning the graph in a distributed setup. Separately, graph databases (graph DBs) with native graph storage and query engines have been developed, enabling time- and resource-efficient graph analytics workloads. We show how to train a GNN directly on a graph DB by retrieving minimal data into memory and sampling with the query engine. Our experiments show resource advantages for single-machine and distributed training. Our approach opens up a new way of scaling GNNs as well as a new application area for graph DBs.
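
To make the idea of "retrieving minimal data into memory and sampling using the query engine" concrete, here is a minimal Python sketch. It is not the paper's implementation. `ToyGraphStore` is a hypothetical in-memory stand-in for a graph database, and `sample_neighbors`, `sample_minibatch`, and the `fanouts` parameter are names assumed purely for illustration. In a real system the body of `sample_neighbors` would be a query executed by the database engine, so the training process would only ever hold the sampled subgraph.

```python
import random
from typing import Dict, Iterable, List, Set, Tuple


class ToyGraphStore:
    """Hypothetical stand-in for a graph database (not a real client API)."""

    def __init__(self, edges: Iterable[Tuple[int, int]], seed: int = 0) -> None:
        # Undirected adjacency lists; in a real setup this lives in the DB.
        self._adj: Dict[int, List[int]] = {}
        for src, dst in edges:
            self._adj.setdefault(src, []).append(dst)
            self._adj.setdefault(dst, []).append(src)
        self._rng = random.Random(seed)

    def sample_neighbors(self, node: int, k: int) -> List[int]:
        """Stands in for a sampling query run inside the query engine."""
        neighbors = self._adj.get(node, [])
        if len(neighbors) <= k:
            return list(neighbors)
        return self._rng.sample(neighbors, k)


def sample_minibatch(
    store: ToyGraphStore, seeds: List[int], fanouts: List[int]
) -> Tuple[Set[int], List[Tuple[int, int]]]:
    """Collect a multi-hop sampled subgraph around the seed nodes.

    Only the returned nodes and edges need to be held by the trainer.
    """
    nodes: Set[int] = set(seeds)
    edges: List[Tuple[int, int]] = []
    frontier = list(seeds)
    for k in fanouts:  # one fanout value per hop
        next_frontier: List[int] = []
        for u in frontier:
            for v in store.sample_neighbors(u, k):
                edges.append((v, u))  # message flows from neighbor v to u
                if v not in nodes:
                    nodes.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return nodes, edges


if __name__ == "__main__":
    # A hub with 100 neighbors: the full neighborhood is never materialized.
    store = ToyGraphStore([(0, i) for i in range(1, 101)])
    batch_nodes, batch_edges = sample_minibatch(store, seeds=[0], fanouts=[3, 2])
    print(len(batch_nodes), len(batch_edges))
```

The point of this design is that the memory held by the trainer scales with the batch size and the fanouts rather than with the size of the graph. A GNN layer would then fetch features only for the returned nodes and aggregate along the returned edges.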