DC | AI | LG | Sep 10, 2024

D3-GNN: Dynamic Distributed Dataflow for Streaming Graph Neural Networks

arXiv:2409.09079v1 | 5 citations | h-index: 7
Originality: Incremental advance
AI Analysis

The paper addresses the problem of handling dynamic graph updates in real-time applications. It represents an incremental advance in systems optimization for streaming GNNs.

The paper tackles the challenge of running graph neural networks (GNNs) on streaming graphs in real time. It introduces D3-GNN, a distributed system that achieves about 76x higher throughput than DGL on streaming workloads; its windowed enhancements further reduce running times by around 10x.

Graph Neural Network (GNN) models on streaming graphs entail algorithmic challenges in continuously capturing the graphs' dynamic state, as well as systems challenges in optimizing latency, memory, and throughput during both inference and training. We present D3-GNN, the first distributed, hybrid-parallel, streaming GNN system designed to handle real-time graph updates under an online query setting. Our system addresses data management, algorithmic, and systems challenges, enabling continuous capture of the graph's dynamic state and updating of node representations with fault tolerance and optimal latency, load balance, and throughput. D3-GNN utilizes streaming GNN aggregators and an unrolled, distributed computation graph architecture to handle cascading graph updates. To counteract data skew and neighborhood explosion issues, we introduce inter-layer and intra-layer windowed forward pass solutions. Experiments on large-scale graph streams demonstrate that D3-GNN achieves high efficiency and scalability. Compared to DGL, D3-GNN achieves a significant throughput improvement of about 76x for streaming workloads. The windowed enhancement further reduces running times by around 10x and message volumes by up to 15x at higher parallelism.
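
The abstract mentions streaming aggregators and a windowed forward pass without showing what either might look like. The following is a minimal single-process sketch of the general idea, not D3-GNN's actual implementation: a mean aggregator that is updated incrementally as edges arrive, and a buffer that emits one downstream message per touched vertex each time a window closes. The class names (`MeanAggregator`, `WindowedForward`), the method names, and the choice of a mean aggregator are all assumptions made for illustration.

```python
from collections import defaultdict


class MeanAggregator:
    """Incrementally maintained mean of a vertex's in-neighbor features (hypothetical)."""

    def __init__(self, dim):
        self.total = [0.0] * dim
        self.count = 0

    def add(self, feat):
        # A new in-edge arrived: fold the neighbor's feature into the running sum.
        self.total = [t + f for t, f in zip(self.total, feat)]
        self.count += 1

    def replace(self, old_feat, new_feat):
        # A neighbor's feature changed: apply only the delta, no full recomputation.
        self.total = [t - o + n for t, o, n in zip(self.total, old_feat, new_feat)]

    def value(self):
        if self.count == 0:
            return list(self.total)
        return [t / self.count for t in self.total]


class WindowedForward:
    """Buffers touched vertices and emits one message per vertex per flush (hypothetical)."""

    def __init__(self, dim):
        self.aggs = defaultdict(lambda: MeanAggregator(dim))
        self.dirty = set()
        self.messages_sent = 0

    def on_edge(self, src_feat, dst):
        # Update the destination's aggregator now, but defer the downstream message.
        self.aggs[dst].add(src_feat)
        self.dirty.add(dst)

    def flush(self):
        # Window closes: one message per dirty vertex, however many edges touched it.
        out = {v: self.aggs[v].value() for v in sorted(self.dirty)}
        self.messages_sent += len(out)
        self.dirty.clear()
        return out
```

In this sketch, three edge arrivals that touch two distinct vertices produce two downstream messages at flush time instead of three. That is the intuition behind batching updates within a window to cut message volume; the reported 10x and 15x figures come from the paper's own distributed system and are not reproduced by this toy.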

