Analyzing the Performance of Graph Neural Networks with Pipe Parallelism
This work addresses the scaling limitations of Graph Neural Networks. It is aimed at researchers and practitioners who work with large graph-structured datasets, and it offers an incremental way to improve efficiency by parallelizing models with existing deep learning tooling.
This paper explores the application of pipeline parallelism to Graph Neural Networks (GNNs) using Google's GPipe framework. The study aims to address the scaling limits of GNNs caused by memory and runtime bottlenecks from recursive calculations on dense graph relationships.
Many datasets of interest in machine learning and deep learning can be described as graphs. As graph-structured datasets grow in scale and complexity, for example in expansive social networks, protein folding, chemical interaction networks, and material phase transitions, improving the efficiency of the machine learning techniques applied to them becomes crucial. In this study, we focus on Graph Neural Networks (GNNs), which have found great success in tasks such as node classification, edge classification, and link prediction. However, standard GNN models have scaling limits: the recursive calculations they must perform through dense graph relationships lead to memory and runtime bottlenecks. New approaches for processing larger networks are needed to advance graph techniques, and several have been proposed. Here we instead study how GNNs could be parallelized using existing tools and frameworks that are known to be successful in the deep learning community. In particular, we investigate applying pipeline parallelism to GNN models with GPipe, introduced by Google in 2018.
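To make the idea of pipeline parallelism concrete, the following is a minimal sketch of a GPipe-style forward schedule; it is not the implementation used in this study. It assumes a model split into sequential stages and a mini-batch split into micro-batches that can be processed independently. That assumption may not hold cleanly for graph data, where nodes depend on their neighbors. The function names `gpipe_forward_schedule` and `bubble_fraction` are hypothetical and chosen for illustration only.

```python
def gpipe_forward_schedule(num_stages, num_micro_batches):
    """Return, per clock tick, the (stage, micro_batch) pairs active in the forward pass.

    Illustrative assumption: every stage takes one tick per micro-batch, and
    stage s can start micro-batch m only after stage s - 1 has finished it.
    """
    total_ticks = num_stages + num_micro_batches - 1
    schedule = []
    for tick in range(total_ticks):
        active = []
        for stage in range(num_stages):
            micro_batch = tick - stage  # micro-batch reaching this stage at this tick
            if 0 <= micro_batch < num_micro_batches:
                active.append((stage, micro_batch))
        schedule.append(active)
    return schedule


def bubble_fraction(num_stages, num_micro_batches):
    """Fraction of (stage, tick) slots that sit idle while the pipeline fills and drains."""
    total_slots = num_stages * (num_stages + num_micro_batches - 1)
    busy_slots = num_stages * num_micro_batches
    return 1 - busy_slots / total_slots


if __name__ == "__main__":
    for tick, active in enumerate(gpipe_forward_schedule(4, 4)):
        print(tick, active)
    print(bubble_fraction(4, 4))
```

Under these assumptions the idle fraction equals (K - 1) / (M + K - 1) for K stages and M micro-batches, so using more micro-batches per mini-batch shrinks the pipeline bubble.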