ReInc: Scaling Training of Dynamic Graph Neural Networks
ReInc addresses the scalability challenge facing researchers and practitioners who apply DGNNs in domains such as traffic prediction and social network analysis; it is a strong, specific gain rather than a foundational breakthrough.
The paper tackles inefficient, poorly scaling training of Dynamic Graph Neural Networks (DGNNs) on large-scale graphs. Its system, ReInc, achieves up to an order of magnitude speedup over state-of-the-art frameworks.
Dynamic Graph Neural Networks (DGNNs) have gained widespread attention due to their applicability in diverse domains such as traffic network prediction, epidemiological forecasting, and social network analysis. In this paper, we present ReInc, a system designed to enable efficient and scalable training of DGNNs on large-scale graphs. ReInc introduces key innovations that capitalize on the unique combination of Graph Neural Networks (GNNs) and Recurrent Neural Networks (RNNs) inherent in DGNNs. By reusing intermediate results and incrementally computing aggregations across consecutive graph snapshots, ReInc significantly enhances computational efficiency. To support these optimizations, ReInc incorporates a novel two-level caching mechanism with a specialized caching policy aligned to the DGNN execution workflow. Additionally, ReInc addresses the challenges of managing structural and temporal dependencies in dynamic graphs through a new distributed training strategy. This approach eliminates communication overheads associated with accessing remote features and redistributing intermediate results. Experimental results demonstrate that ReInc achieves up to an order of magnitude speedup compared to state-of-the-art frameworks, tested across various dynamic GNN architectures and real-world graph datasets.
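The idea of incrementally computing aggregations across consecutive snapshots can be illustrated with a minimal sketch. It is not ReInc's implementation: it assumes a sum aggregator, node features that stay fixed between the two snapshots, and a hypothetical edge-set representation of each snapshot. The function names `full_aggregate` and `incremental_aggregate` are invented for illustration.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]  # (source, destination); hypothetical representation
Features = Dict[int, List[float]]


def full_aggregate(edges: Set[Edge], feats: Features) -> Features:
    """Sum each node's in-neighbor features from scratch."""
    agg = {v: [0.0] * len(vec) for v, vec in feats.items()}
    for src, dst in edges:
        for i, x in enumerate(feats[src]):
            agg[dst][i] += x
    return agg


def incremental_aggregate(
    prev_agg: Features,
    prev_edges: Set[Edge],
    next_edges: Set[Edge],
    feats: Features,
) -> Features:
    """Derive the next snapshot's sums from the previous ones.

    Only edges that were added or removed are visited, so the cost
    scales with the size of the edge delta rather than the whole graph.
    Assumes node features are unchanged between the two snapshots.
    """
    agg = {v: list(vec) for v, vec in prev_agg.items()}  # reuse cached result
    for src, dst in next_edges - prev_edges:  # edges added in the new snapshot
        for i, x in enumerate(feats[src]):
            agg[dst][i] += x
    for src, dst in prev_edges - next_edges:  # edges removed in the new snapshot
        for i, x in enumerate(feats[src]):
            agg[dst][i] -= x
    return agg


# Toy data: three nodes, two consecutive snapshots differing by two edges.
feats = {0: [1.0, 2.0], 1: [3.0, 4.0], 2: [5.0, 6.0]}
snapshot0 = {(0, 1), (1, 2), (2, 0)}
snapshot1 = {(0, 1), (2, 0), (0, 2)}

agg0 = full_aggregate(snapshot0, feats)
agg1_inc = incremental_aggregate(agg0, snapshot0, snapshot1, feats)
agg1_full = full_aggregate(snapshot1, feats)
```

In this toy case the incremental update touches two edges instead of three; on real dynamic graphs, where consecutive snapshots typically share most of their edges, the saving would be correspondingly larger. Aggregators that are not invertible (for example max) would need a different update rule than the subtraction used here.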