Multi-Modality Spatio-Temporal Forecasting via Self-Supervised Learning
This work addresses forecasting problems in monitoring systems such as those for traffic and air quality, but it appears incremental: it builds on existing spatio-temporal modeling and mainly adds a focus on multi-modality integration.
The paper tackles the challenge of multi-modality spatio-temporal forecasting by proposing a self-supervised learning framework to uncover latent patterns and quantify dynamic heterogeneity, achieving superior performance compared to state-of-the-art baselines on two real-world datasets.
Multi-modality spatio-temporal (MoST) data extends spatio-temporal (ST) data by incorporating multiple modalities. Such data is prevalent in monitoring systems, covering diverse traffic demands and air quality assessments. Despite significant strides in ST modeling in recent years, the potential of information from different modalities remains underexploited. Robust MoST forecasting is more challenging than ST forecasting because MoST data exhibits (i) high-dimensional and complex internal structures and (ii) dynamic heterogeneity caused by temporal, spatial, and modality variations. In this study, we propose a novel MoST learning framework based on self-supervised learning, namely MoSSL, which aims to uncover latent patterns from temporal, spatial, and modality perspectives while quantifying dynamic heterogeneity. Experimental results on two real-world MoST datasets verify the superiority of our approach over state-of-the-art baselines. The model implementation is available at https://github.com/beginner-sketch/MoSSL.