LG · DC · Nov 22, 2023

Comprehensive Evaluation of GNN Training Systems: A Data Management Perspective

arXiv:2311.13279v2 · 20 citations · h-index: 10
Originality: Synthesis-oriented
AI Analysis

This work addresses data management inefficiencies in GNN training for researchers and developers, but it is incremental as it reviews and evaluates existing approaches rather than introducing new methods.

The paper reviews Graph Neural Network (GNN) training systems from a data management perspective, analyzing challenges like data partitioning and batch preparation, and provides experimental results on benchmark datasets with practical tips for future system design.

Many Graph Neural Network (GNN) training systems have emerged recently to support efficient GNN training. Since GNNs embody complex data dependencies between training samples, the training of GNNs should address distinct challenges different from DNN training in data management, such as data partitioning, batch preparation for mini-batch training, and data transferring between CPUs and GPUs. These factors, which take up a large proportion of training time, make data management in GNN training more significant. This paper reviews GNN training from a data management perspective and provides a comprehensive analysis and evaluation of the representative approaches. We conduct extensive experiments on various benchmark datasets and show many interesting and valuable results. We also provide some practical tips learned from these experiments, which are helpful for designing GNN training systems in the future.
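To make the "batch preparation" challenge concrete, here is a minimal sketch of neighbor sampling for one mini-batch. It is an illustration under stated assumptions, not the method of any system evaluated in the paper: the function name `sample_neighbors`, the toy adjacency list, and the fanout values are all hypothetical. It shows why a GNN mini-batch cannot be built by slicing independent rows as in ordinary DNN training: each seed node pulls in its sampled multi-hop neighborhood.

```python
import random

def sample_neighbors(adj, seeds, fanouts, rng):
    """Collect the nodes needed to compute embeddings for `seeds`.

    adj     : dict mapping node id -> list of neighbor ids (hypothetical toy format)
    seeds   : training nodes in this mini-batch
    fanouts : max neighbors to sample per node at each hop, e.g. [2, 2]
    rng     : random.Random instance, passed in for reproducibility
    """
    needed = set(seeds)
    frontier = set(seeds)
    for fanout in fanouts:
        next_frontier = set()
        for node in frontier:
            neighbors = adj.get(node, [])
            k = min(fanout, len(neighbors))
            # Sample without replacement; take all neighbors if fewer than fanout.
            next_frontier.update(rng.sample(neighbors, k))
        needed |= next_frontier
        frontier = next_frontier
    return needed

# Toy graph (assumed for illustration only).
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
rng = random.Random(0)
batch_nodes = sample_neighbors(adj, seeds=[0], fanouts=[2, 2], rng=rng)
```

Even with a single seed node, the batch grows to every node within two sampled hops. The features of all those nodes must be gathered and, in a GPU setting, transferred to the device, which is where the data management cost described in the abstract arises.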

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
