RetroMotion: Retrocausal Motion Forecasting Models are Instructable
This work addresses scalable multi-agent motion forecasting for autonomous driving by decomposing forecasts into marginal and joint distributions and by providing an interface for issuing instructions.
The authors decompose multi-agent motion forecasting into marginal and joint distributions using a transformer with retrocausal flow, achieving strong results in the Waymo Interaction Prediction Challenge and generalizing to Argoverse 2 and V2X-Seq datasets. The method also provides an interface for issuing instructions.
Motion forecasts of road users (i.e., agents) vary in complexity depending on the number of agents, scene constraints, and interactions. In particular, the output space of joint trajectory distributions grows exponentially with the number of agents. Therefore, we decompose multi-agent motion forecasts into (1) marginal distributions for all modeled agents and (2) joint distributions for interacting agents. Using a transformer model, we generate joint distributions by re-encoding marginal distributions followed by pairwise modeling. This incorporates a retrocausal flow of information from later points in marginal trajectories to earlier points in joint trajectories. For each time step, we model the positional uncertainty using compressed exponential power distributions. Notably, our method achieves strong results in the Waymo Interaction Prediction Challenge and generalizes well to the Argoverse 2 and V2X-Seq datasets. Additionally, our method provides an interface for issuing instructions. We show that standard motion forecasting training implicitly enables the model to follow instructions and adapt them to the scene context. GitHub repository: https://github.com/kit-mrt/future-motion