LG DC · Nov 22, 2023

Comprehensive Evaluation of GNN Training Systems: A Data Management Perspective

Hao Yuan, Yajiong Liu, Yanfeng Zhang, Xin Ai, Qiange Wang, Chaoyi Chen, Yu Gu, Ge Yu

arXiv:2311.13279v213.720 citations · h-index: 10 · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses data management inefficiencies in GNN training for researchers and developers, but it is incremental as it reviews and evaluates existing approaches rather than introducing new methods.

The paper reviews Graph Neural Network (GNN) training systems from a data management perspective, analyzing challenges like data partitioning and batch preparation, and provides experimental results on benchmark datasets with practical tips for future system design.

Many Graph Neural Network (GNN) training systems have emerged recently to support efficient GNN training. Because GNNs embody complex data dependencies between training samples, GNN training must address data management challenges distinct from those in DNN training, such as data partitioning, batch preparation for mini-batch training, and data transfer between CPUs and GPUs. These factors take up a large proportion of training time, which makes data management especially significant in GNN training. This paper reviews GNN training from a data management perspective and provides a comprehensive analysis and evaluation of the representative approaches. We conduct extensive experiments on various benchmark datasets and show many interesting and valuable results. We also provide practical tips learned from these experiments, which are helpful for designing future GNN training systems.
