CV · Nov 29, 2023

Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving

arXiv:2311.17918v1 · 325 citations · h-index: 23 · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses safety and efficiency in autonomous driving by enabling simulation and planning with a world model, but the advance is incremental because it builds on existing end-to-end models.

The paper tackles future prediction for autonomous driving by proposing Drive-WM, a world model that generates high-fidelity multiview videos and enables planning with image-based rewards. Evaluation on real-world driving datasets shows that it can produce high-quality, controllable videos.

In autonomous driving, predicting future events in advance and evaluating the foreseeable risks empowers autonomous vehicles to better plan their actions, enhancing safety and efficiency on the road. To this end, we propose Drive-WM, the first driving world model compatible with existing end-to-end planning models. Through a joint spatial-temporal modeling facilitated by view factorization, our model generates high-fidelity multiview videos in driving scenes. Building on its powerful generation ability, we showcase the potential of applying the world model for safe driving planning for the first time. Particularly, our Drive-WM enables driving into multiple futures based on distinct driving maneuvers, and determines the optimal trajectory according to the image-based rewards. Evaluation on real-world driving datasets verifies that our method could generate high-quality, consistent, and controllable multiview videos, opening up possibilities for real-world simulations and safe planning.
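The planning idea in the abstract (roll out one future per candidate maneuver, score each future with a reward, keep the best) can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in rather than Drive-WM's actual code or API: `toy_world_model` replaces the multiview video generator with a list of lateral offsets, `toy_reward` replaces the image-based reward with a distance-from-lane-centre penalty, and the maneuver names are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical placeholder for a generated multiview video clip.
Future = List[float]


def toy_world_model(ego_offset: float, maneuver: str) -> Future:
    """Stand-in rollout: lateral offset of the ego car over 4 future steps."""
    drift = {"keep_lane": 0.0, "turn_left": -0.5, "turn_right": 0.5}[maneuver]
    return [ego_offset + drift * step for step in range(1, 5)]


def toy_reward(future: Future) -> float:
    """Stand-in for an image-based reward: penalise distance from lane centre (0.0)."""
    return -sum(abs(offset) for offset in future)


def plan(
    state: float,
    maneuvers: List[str],
    world_model: Callable[[float, str], Future],
    reward_fn: Callable[[Future], float],
) -> Tuple[str, Dict[str, float]]:
    """Imagine one future per maneuver, score it, and return the best maneuver."""
    scores = {m: reward_fn(world_model(state, m)) for m in maneuvers}
    best = max(scores, key=scores.get)
    return best, scores


# The ego car starts 1.0 units right of the lane centre.
best, scores = plan(
    1.0, ["keep_lane", "turn_left", "turn_right"], toy_world_model, toy_reward
)
print(best, scores)
```

In this toy setup the car starts to the right of the lane centre, so the maneuver that steers back toward the centre receives the highest score. A real system would swap the two stand-in functions for a learned video generator and a reward computed on the generated frames.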

Code Implementations: 1 repo
