RO CV LG · Jan 28, 2025

Dream to Drive with Predictive Individual World Model

Yinfeng Gao, Qichao Zhang, Da-wei Ding, Dongbin Zhao

arXiv:2501.16733v118.714 citations · h-index: 26 · Has Code · IEEE Trans Intell Veh

Originality: Incremental advance

AI Analysis

This work aims to improve reactive driving policies through better modeling of vehicle interactions. It appears incremental, building on model-based reinforcement learning with a focus on individual-level representation.

The paper tackles the challenge of reactive driving in complex urban environments by introducing a predictive individual world model (PIWM) that captures vehicle interactions and intentions. The authors report the best safety and efficiency among the model-free and model-based reinforcement learning baselines they compare against.

Producing reactive driving behavior in complex urban environments remains challenging because road users' intentions are unknown. Model-based reinforcement learning (MBRL) offers great potential for learning a reactive policy by constructing a world model that provides informative states and supports imagination training. However, a critical limitation of related research is its reliance on scene-level reconstruction for representation learning, which may overlook key interactive vehicles and struggles to model the interactive features among vehicles and their long-term intentions. Therefore, this paper presents a novel MBRL method with a predictive individual world model (PIWM) for autonomous driving. PIWM describes the driving environment from an individual-level perspective and captures vehicles' interactive relations and intentions via a trajectory prediction task. Meanwhile, a behavior policy is learned jointly with PIWM. It is trained in PIWM's imagination and navigates urban driving scenes effectively by leveraging intention-aware latent states. The proposed method is trained and evaluated in simulation environments built upon challenging real-world interactive scenarios. Compared with popular model-free and state-of-the-art model-based reinforcement learning methods, experimental results show that the proposed method achieves the best performance in terms of safety and efficiency.
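As a rough illustration of what "training in imagination" with individual-level latent states could look like, the sketch below keeps one latent vector per vehicle, rolls a policy forward inside a toy model, and decodes a predicted waypoint for each vehicle at every step. It is not the authors' implementation: the sizes, the random linear maps standing in for learned networks, and the function names `policy`, `step`, and `imagine` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for illustration.
N_VEHICLES, LATENT, ACTION, HORIZON = 4, 8, 2, 5

# Random linear maps stand in for learned networks (assumption).
W_dyn = rng.normal(scale=0.1, size=(LATENT + ACTION, LATENT))
W_pol = rng.normal(scale=0.1, size=(N_VEHICLES * LATENT, ACTION))
W_rew = rng.normal(scale=0.1, size=(N_VEHICLES * LATENT,))
W_traj = rng.normal(scale=0.1, size=(LATENT, 2))


def policy(latents):
    """Map all per-vehicle latents to one bounded ego action."""
    return np.tanh(latents.reshape(-1) @ W_pol)


def step(latents, action):
    """Advance each vehicle's latent one step, conditioned on the ego action."""
    tiled = np.tile(action, (latents.shape[0], 1))
    return np.tanh(np.concatenate([latents, tiled], axis=1) @ W_dyn)


def imagine(latents, horizon=HORIZON):
    """Roll the policy forward inside the model; no simulator is queried."""
    rewards, waypoints = [], []
    for _ in range(horizon):
        action = policy(latents)
        latents = step(latents, action)
        rewards.append(float(latents.reshape(-1) @ W_rew))
        waypoints.append(latents @ W_traj)  # predicted (x, y) per vehicle
    return np.array(rewards), np.stack(waypoints)


rewards, waypoints = imagine(rng.normal(size=(N_VEHICLES, LATENT)))
print(rewards.shape, waypoints.shape)
```

In an actual MBRL setup the imagined rewards would feed a policy-gradient or actor-critic update, and the waypoint head would be supervised with observed trajectories. Both steps are omitted here to keep the sketch short.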
