CV · Jun 12, 2024

Enhancing End-to-End Autonomous Driving with Latent World Model

Yingyan Li, Lue Fan, Jiawei He, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang, Tieniu Tan

arXiv:2406.08481v235.8135 citations · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of making better use of sensor data in autonomous driving systems, though it appears incremental because it builds on existing self-supervised learning methods.

The paper tackles the problem of improving scene feature representations for end-to-end autonomous driving planners by proposing a self-supervised learning approach using a latent world model (LAW) that predicts future scene features. It achieves state-of-the-art performance on benchmarks like nuScenes, NAVSIM, and CARLA.

In autonomous driving, end-to-end planners directly utilize raw sensor data, enabling them to extract richer scene features and reduce information loss compared to traditional planners. This raises a crucial research question: how can we develop better scene feature representations to fully leverage sensor data in end-to-end driving? Self-supervised learning methods have shown great success in learning rich feature representations in NLP and computer vision. Inspired by this, we propose a novel self-supervised learning approach using the LAtent World model (LAW) for end-to-end driving. LAW predicts future scene features based on current features and ego trajectories. This self-supervised task can be seamlessly integrated into both perception-free and perception-based frameworks, improving scene feature learning and optimizing trajectory prediction. LAW achieves state-of-the-art performance across multiple benchmarks, including the real-world open-loop benchmarks nuScenes and NAVSIM and the simulator-based closed-loop benchmark CARLA. The code is released at https://github.com/BraveGroup/LAW.
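To make the self-supervised objective concrete, here is a minimal sketch of the idea that LAW "predicts future scene features based on current features and ego trajectories". Everything in it is an assumption made for illustration: the class name `LatentWorldModelSketch`, the single linear layer, the flattened (x, y) waypoint encoding, and the mean-squared-error loss are hypothetical stand-ins, not the architecture or loss from the paper or its released code.

```python
import random


def linear(weights, bias, x):
    """Apply y = W x + b, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


class LatentWorldModelSketch:
    """Hypothetical stand-in: one linear layer, NOT the paper's network."""

    def __init__(self, feat_dim, traj_dim, seed=0):
        rng = random.Random(seed)
        in_dim = feat_dim + traj_dim
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                        for _ in range(feat_dim)]
        self.bias = [0.0] * feat_dim

    def predict(self, current_feat, ego_traj):
        # Assumption: the ego trajectory is a list of (x, y) waypoints,
        # flattened and concatenated with the current scene feature.
        flat_traj = [coord for waypoint in ego_traj for coord in waypoint]
        return linear(self.weights, self.bias, current_feat + flat_traj)


def self_supervised_loss(model, current_feat, ego_traj, next_feat):
    """Compare the predicted future feature with the one actually observed.

    No labels are needed: the target is the feature extracted from the
    next frame, which is what makes the task self-supervised.
    """
    predicted = model.predict(current_feat, ego_traj)
    return mse(predicted, next_feat)


# Toy usage with made-up numbers (4-dim feature, 3 waypoints).
model = LatentWorldModelSketch(feat_dim=4, traj_dim=6)
feat_t = [0.2, -0.1, 0.5, 0.0]
traj = [(1.0, 0.0), (2.0, 0.1), (3.0, 0.3)]
feat_t1 = [0.25, -0.05, 0.45, 0.1]
loss = self_supervised_loss(model, feat_t, traj, feat_t1)
```

In a real training loop this loss would presumably be added to the planner's trajectory loss and minimized by gradient descent, so that the feature extractor is pushed toward representations from which the future is predictable; the sketch only computes the loss value and does not train anything.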
