CV | AI | May 8, 2024

Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models

arXiv:2405.04909v1 | 3 citations | h-index: 11
Originality: Highly original
AI Analysis

This work addresses gaps in scene cognition and understanding in trajectory prediction for autonomous driving, offering an approach intended to be more universal and adaptable.

The paper tackles trajectory prediction in autonomous driving by proposing Traj-LLM, which uses pre-trained large language models, without explicit prompt engineering, to generate future motion from past trajectories and scene semantics. It reports state-of-the-art performance across evaluation metrics and, with only 50% of the dataset, outperforms the majority of benchmarks trained on the complete data.

Predicting the future trajectories of dynamic traffic actors is a cornerstone task in autonomous driving. Though notable existing efforts have yielded impressive performance improvements, a gap persists in scene cognition and in understanding complex traffic semantics. This paper proposes Traj-LLM, the first work to investigate the potential of using Large Language Models (LLMs) without explicit prompt engineering to generate future motion from agents' past/observed trajectories and scene semantics. Traj-LLM starts with sparse context joint coding to dissect the agent and scene features into a form that LLMs understand. On this basis, we explore LLMs' powerful comprehension abilities to capture a spectrum of high-level scene knowledge and interactive information. To emulate the human-like cognitive function of focusing on lanes and to enhance Traj-LLM's scene comprehension, we introduce lane-aware probabilistic learning powered by the Mamba module. Finally, a multi-modal Laplace decoder is designed to achieve scene-compliant multi-modal predictions. Extensive experiments demonstrate that Traj-LLM, fortified by LLMs' strong prior knowledge and understanding, together with lane-aware probabilistic learning, outperforms state-of-the-art methods across evaluation metrics. Moreover, the few-shot analysis further substantiates Traj-LLM's performance: with just 50% of the dataset, it outperforms the majority of benchmarks that rely on the complete data. This study explores equipping the trajectory prediction task with the advanced capabilities inherent in LLMs, furnishing a more universal and adaptable solution for forecasting agent motion.
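The abstract names a multi-modal Laplace decoder but does not give its output parameterization or training loss. As a rough illustration only, the sketch below assumes each predicted mode emits a per-coordinate Laplace location and scale for every future timestep, and that the loss is a negative log-likelihood computed on the mode closest to the ground truth (a winner-takes-all scheme that is common in trajectory forecasting, not confirmed as Traj-LLM's choice). All function names here are hypothetical.

```python
import math

def laplace_nll(target, loc, scale):
    """Negative log-likelihood of `target` under Laplace(loc, scale)."""
    return math.log(2.0 * scale) + abs(target - loc) / scale

def trajectory_nll(truth, locs, scales):
    """Sum the per-coordinate Laplace NLL over every timestep of one mode."""
    total = 0.0
    for (tx, ty), (mx, my), (bx, by) in zip(truth, locs, scales):
        total += laplace_nll(tx, mx, bx) + laplace_nll(ty, my, by)
    return total

def best_mode(truth, mode_locs):
    """Index of the mode with the lowest average displacement error (ADE)."""
    def ade(locs):
        return sum(math.dist(p, q) for p, q in zip(truth, locs)) / len(truth)
    return min(range(len(mode_locs)), key=lambda k: ade(mode_locs[k]))

def winner_takes_all_loss(truth, mode_locs, mode_scales):
    """Assumed training signal: NLL of the single best-matching mode."""
    k = best_mode(truth, mode_locs)
    return k, trajectory_nll(truth, mode_locs[k], mode_scales[k])
```

Given a ground-truth trajectory as a list of (x, y) points and K candidate modes with matching scales, `winner_takes_all_loss` returns the index of the selected mode and its loss. Scoring only the best mode lets the remaining modes stay diverse instead of collapsing toward an average path.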

