Spatiotemporal Forecasting as Planning: A Model-Based Reinforcement Learning Approach with Generative World Models
This work addresses forecasting problems in physical domains (e.g., weather or traffic) where traditional methods struggle with inherent uncertainty and with evaluation metrics that cannot be optimized by gradient descent. The authors present it as a new paradigm rather than an incremental improvement.
The paper tackles two challenges in spatiotemporal forecasting: stochasticity and non-differentiable metrics. It proposes Spatiotemporal Forecasting as Planning (SFP), a model-based reinforcement learning approach in which a generative world model simulates candidate futures and a beam search-based planning algorithm selects high-reward ones as pseudo-labels for self-training. The authors report reduced prediction error and improved performance on domain metrics, such as those that measure how well extreme events are captured.
To address the dual challenges of inherent stochasticity and non-differentiable metrics in physical spatiotemporal forecasting, we propose Spatiotemporal Forecasting as Planning (SFP), a new paradigm grounded in Model-Based Reinforcement Learning. SFP constructs a novel Generative World Model to simulate diverse, high-fidelity future states, enabling an "imagination-based" environmental simulation. Within this framework, a base forecasting model acts as an agent, guided by a beam search-based planning algorithm that leverages non-differentiable domain metrics as reward signals to explore high-return future sequences. These identified high-reward candidates then serve as pseudo-labels to continuously optimize the agent's policy through iterative self-training, significantly reducing prediction error and demonstrating exceptional performance on critical domain metrics like capturing extreme events.
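The planning loop described above can be illustrated with a small toy sketch. This is not the authors' implementation: the damped-persistence agent, the Gaussian-noise "world model", the critical success index (CSI) reward, and every function name and parameter below are assumptions chosen only to make the idea concrete. The sketch shows how a thresholded, non-differentiable score can still rank sampled futures inside a beam search.

```python
import random

def csi(pred, target, threshold=1.0):
    """Critical success index for threshold exceedance (non-differentiable)."""
    hits = sum(p >= threshold and t >= threshold for p, t in zip(pred, target))
    misses = sum(p < threshold and t >= threshold for p, t in zip(pred, target))
    false_alarms = sum(p >= threshold and t < threshold for p, t in zip(pred, target))
    denom = hits + misses + false_alarms
    return hits / denom if denom else 1.0  # no events anywhere counts as perfect

def agent_forecast(state, weight=0.9):
    """Hypothetical base forecaster (the 'agent'): damped persistence."""
    return [weight * x for x in state]

def world_model_samples(state, n_samples, rng, noise=0.5):
    """Hypothetical generative world model: noisy variants of the agent forecast."""
    base = agent_forecast(state)
    return [[x + rng.gauss(0.0, noise) for x in base] for _ in range(n_samples)]

def beam_search_plan(initial_state, targets, beam_width=3, n_samples=8, seed=0):
    """Keep the beam_width partial rollouts with the highest cumulative reward."""
    rng = random.Random(seed)
    # Each beam entry: (cumulative reward, sequence so far, last state)
    beams = [(0.0, [], initial_state)]
    for target in targets:
        candidates = []
        for score, seq, state in beams:
            for nxt in world_model_samples(state, n_samples, rng):
                candidates.append((score + csi(nxt, target), seq + [nxt], nxt))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    best_score, best_seq, _ = beams[0]
    return best_score, best_seq  # best_seq plays the role of a pseudo-label
```

Calling `beam_search_plan([1.2, 0.4, 1.5, 0.2], [[1.1, 0.3, 1.4, 0.1], [1.0, 0.2, 1.3, 0.1]])` returns a cumulative reward and a two-step sequence. In a self-training loop of the kind the abstract describes, such a sequence would be used as a regression target for the agent; that update step is omitted here, and the sketch assumes the reward is computed against observed targets during training.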