LG | AI | Feb 6, 2025

Synthetic Datasets for Machine Learning on Spatio-Temporal Graphs using PDEs

arXiv:2502.04140v2 | h-index: 3 | Has Code | J. Data-centric Mach. Learn. Res.
Originality: Synthesis-oriented
AI Analysis

This work addresses data scarcity for researchers in temporal graph machine learning, particularly for modeling physical processes, but it is incremental as it extends existing PDE-based methods to new domains.

The authors address the scarcity of datasets for spatio-temporal graph machine learning by generating synthetic datasets from partial differential equations (PDEs) that model three kinds of disasters and hazards: epidemics, atmospheric particles, and tsunami waves. They also show that pre-training on the synthetic epidemiological dataset can improve model performance on real-world epidemiological data.

Many physical processes can be expressed through partial differential equations (PDEs). Real-world measurements of such processes are often collected at irregularly distributed points in space, which can be effectively represented as graphs; however, only a few such datasets currently exist. Our work aims to make advancements in the field of PDE modeling accessible to the temporal graph machine learning community while addressing the data scarcity problem. To this end, we create and use synthetic datasets based on PDEs to support spatio-temporal graph modeling in machine learning for different applications. More precisely, we showcase three equations that model different types of disasters and hazards in the fields of epidemiology, atmospheric particles, and tsunami waves. Further, we show how such datasets can be used by benchmarking several machine learning models on the epidemiological dataset. Additionally, we show how pre-training on this dataset can improve model performance on real-world epidemiological data. The presented methods enable others to create datasets and benchmarks customized to individual requirements. The source code for our methodology and the three created datasets can be found at https://github.com/github-usr-ano/Temporal_Graph_Data_PDEs.
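To make the idea of a PDE-based spatio-temporal graph dataset concrete, here is a minimal sketch. It is not the authors' pipeline and does not use any of their three equations. It assumes a plain diffusion equation, discretized with a graph Laplacian on a k-nearest-neighbour graph over randomly placed points and integrated with explicit Euler steps. The function names `knn_adjacency` and `simulate_diffusion` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def knn_adjacency(points, k):
    """Return a symmetric 0/1 adjacency matrix of a k-nearest-neighbour graph."""
    n = len(points)
    # Pairwise Euclidean distances between all points.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # exclude self-loops
    neighbours = np.argsort(dists, axis=1)[:, :k]
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbours.ravel()] = 1.0
    # An edge exists if either endpoint lists the other as a neighbour.
    return np.maximum(adj, adj.T)


def simulate_diffusion(adj, u0, diffusivity, dt, steps):
    """Explicit Euler integration of du/dt = -D * L u, with L = Deg - A.

    Returns an array of shape (steps + 1, n_nodes): one node signal per time step.
    """
    laplacian = np.diag(adj.sum(axis=1)) - adj
    u = u0.copy()
    frames = [u.copy()]
    for _ in range(steps):
        u = u - dt * diffusivity * (laplacian @ u)
        frames.append(u.copy())
    return np.stack(frames)


# Illustrative setup (all values are assumptions): 50 irregularly placed sensors.
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 1.0, size=(50, 2))
adj = knn_adjacency(points, k=4)

# Initial condition: all of the quantity concentrated at node 0.
u0 = np.zeros(50)
u0[0] = 1.0

# Node-by-time signal that a temporal graph model could be trained on.
signal = simulate_diffusion(adj, u0, diffusivity=0.1, dt=0.05, steps=100)
```

The resulting `signal` array, together with `adj`, has the shape of a small temporal graph dataset: a fixed graph with one scalar feature per node per time step. The small step size keeps the explicit scheme stable for this graph; a stiffer equation or a denser graph would likely call for an implicit solver instead.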

Code Implementations: 1 repo