DC AI LG | Jul 15, 2025

PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training

Seth Ockerman, Amal Gueroudji, Tanwi Mallick, Yixuan He, Line Pouchard, Robert Ross, Shivaram Venkataraman

arXiv:2507.11683v33.33 citations | h-index: 6 | Has Code | SC

Originality: Highly original

AI Analysis

This work addresses a scalability bottleneck for researchers and practitioners working with large-scale spatiotemporal data. It is a strong, specific gain rather than a foundational breakthrough.

The paper tackles the memory constraints that limit spatiotemporal graph neural networks (ST-GNNs) to small-scale datasets. It introduces PGT-I, a distributed training framework that reduces peak memory usage by up to 89% and achieves up to an 11.78x speedup over standard DDP with 128 GPUs, enabling training on the entire PeMS dataset without graph partitioning.

Spatiotemporal graph neural networks (ST-GNNs) are powerful tools for modeling spatial and temporal data dependencies. However, their applications have been limited primarily to small-scale datasets because of memory constraints. While distributed training offers a solution, current frameworks lack support for spatiotemporal models and overlook the properties of spatiotemporal data. Informed by a scaling study on a large-scale workload, we present PyTorch Geometric Temporal Index (PGT-I), an extension to PyTorch Geometric Temporal that integrates distributed data parallel training and two novel strategies: index-batching and distributed-index-batching. Our index techniques exploit spatiotemporal structure to construct snapshots dynamically at runtime, significantly reducing memory overhead, while distributed-index-batching extends this approach by enabling scalable processing across multiple GPUs. Our techniques enable the first-ever training of an ST-GNN on the entire PeMS dataset without graph partitioning, reducing peak memory usage by up to 89% and achieving up to an 11.78x speedup over standard DDP with 128 GPUs.
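
The idea behind building snapshots at runtime can be illustrated with a minimal, pure-Python sketch. It is not PGT-I's actual API: the names `materialized_windows`, `IndexBatchedWindows`, `shard_starts`, and `stored_values` are hypothetical, and the sliding-window layout (a `horizon` of input steps followed by a `horizon` of target steps) is an assumption made for illustration. The baseline copies every overlapping window up front, while the index-based variant keeps one copy of the series plus a list of start indices and slices each window on demand.

```python
from typing import List, Sequence, Tuple

Row = List[float]                     # one time step: a value per graph node
Window = Tuple[List[Row], List[Row]]  # (input steps, target steps)


def materialized_windows(series: Sequence[Row], horizon: int) -> List[Window]:
    """Baseline: copy every sliding window up front, duplicating overlapping rows."""
    count = len(series) - 2 * horizon + 1
    return [
        ([row[:] for row in series[s:s + horizon]],
         [row[:] for row in series[s + horizon:s + 2 * horizon]])
        for s in range(count)
    ]


class IndexBatchedWindows:
    """Hypothetical index-based dataset: one copy of the series plus start indices."""

    def __init__(self, series: Sequence[Row], horizon: int) -> None:
        self.series = series
        self.horizon = horizon
        self.starts = list(range(len(series) - 2 * horizon + 1))

    def __len__(self) -> int:
        return len(self.starts)

    def __getitem__(self, i: int) -> Window:
        # Build the window only when it is requested; rows are shared, not copied.
        s, h = self.starts[i], self.horizon
        return list(self.series[s:s + h]), list(self.series[s + h:s + 2 * h])


def shard_starts(starts: Sequence[int], rank: int, world_size: int) -> List[int]:
    """Assumed sharding scheme: each worker takes every world_size-th start index."""
    return list(starts[rank::world_size])


def stored_values(windows: Sequence[Window]) -> int:
    """Count the scalar values held by a list of materialized windows."""
    return sum(len(row) for x, y in windows for row in x + y)
```

For a toy series of 10 time steps over 3 nodes with `horizon=2`, the baseline holds 7 windows of 12 values each (84 values), while the index-based variant holds the 30 original values plus 7 integers. In a multi-GPU setting, one plausible arrangement under the same assumptions is for each worker to keep the series and iterate only over its own `shard_starts(...)` slice.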

