CV | LG | Apr 15, 2025

GC-GAT: Multimodal Vehicular Trajectory Prediction using Graph Goal Conditioning and Cross-context Attention

arXiv:2504.11150v2 | 2 citations | h-index: 6 | IEEE Robot Autom Lett
Originality: Highly original
AI Analysis

This work addresses trajectory prediction for autonomous vehicles. It is an incremental improvement that applies a novel method to a known bottleneck.

The paper tackles multimodal vehicular trajectory prediction by proposing a model that predicts graph-based goal proposals and fuses them with cross-context attention, achieving state-of-the-art results on the nuScenes dataset.

Predicting the future trajectories of surrounding vehicles depends heavily on the contextual information given to a motion prediction model. The context itself can be static (lanes, regulatory elements, etc.) or dynamic (traffic participants). This paper presents a lane graph-based motion prediction model that first predicts graph-based goal proposals and later fuses them with cross-attention over multiple contextual elements. We follow the well-known encoder-interactor-decoder architecture, where the encoder encodes scene context using lightweight Gated Recurrent Units, the interactor applies cross-context attention over encoded scene features and graph goal proposals, and the decoder regresses multimodal trajectories from the aggregated encodings via a Laplacian Mixture Density Network. Using cross-attention over graph-based goal proposals gives robust trajectory estimates, since the model learns to attend to the scene elements relevant to the future goal of the intended agent. We evaluate our work on the nuScenes motion prediction dataset, achieving state-of-the-art results.
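
To make the goal-conditioned attention idea concrete, here is a minimal sketch of single-head scaled dot-product cross-attention in plain Python, where a goal-proposal encoding acts as the query and the encoded scene elements act as keys and values. It is an illustration only, not the paper's implementation: the function names, the toy vectors, the scene element names, and the absence of learned projection matrices are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, keys, values):
    """Single-head scaled dot-product attention for one query vector.

    query:  goal-proposal encoding, length d
    keys:   one length-d vector per scene element
    values: one vector per scene element (all the same length)
    Returns (attended vector, attention weights).
    Assumption: no learned query/key/value projections are applied.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    dim_v = len(values[0])
    attended = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(dim_v)
    ]
    return attended, weights


# Hypothetical toy encodings, invented for this example.
goal = [1.0, 0.0]
scene = {
    "lane_ahead": [2.0, 0.0],
    "parked_car": [0.0, 2.0],
    "crossing_lane": [-1.0, 0.5],
}
names = list(scene)
keys = [scene[n] for n in names]

# Keys double as values here to keep the sketch short.
attended, weights = cross_attention(goal, keys, keys)
for name, weight in zip(names, weights):
    print(f"{name}: {weight:.3f}")
```

In this toy setup the scene element whose encoding aligns best with the goal encoding receives the largest weight, which is the behaviour the abstract describes as attending to goal-relevant scene elements. A trained model would learn the encodings and projections rather than use hand-picked vectors.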
