CV · Nov 26, 2024

D$^2$-World: An Efficient World Model through Decoupled Dynamic Flow

arXiv:2411.17027v1 · 1 citation · h-index: 9 · Has Code
Originality: Incremental advance
AI Analysis

This paper is an incremental advance in efficient world modeling for point cloud forecasting in autonomous systems.

The paper tackles future point cloud forecasting for autonomous systems. It introduces D²-World, a world model that uses decoupled dynamic flow to predict future occupancy in a non-autoregressive manner. The model achieves state-of-the-art performance on the OpenScene benchmark and trains more than 300% faster than the baseline.

This technical report summarizes the second-place solution for the Predictive World Model Challenge held at the CVPR-2024 Workshop on Foundation Models for Autonomous Systems. We introduce D$^2$-World, a novel World model that effectively forecasts future point clouds through Decoupled Dynamic flow. Specifically, the past semantic occupancies are obtained via existing occupancy networks (e.g., BEVDet). These occupancy results then serve as the input to a single-stage world model, which generates future occupancy in a non-autoregressive manner. To further simplify the task, dynamic voxel decoupling is performed in the world model. The model generates future dynamic voxels by warping the existing observations through voxel flow, while the remaining static voxels can be easily obtained through pose transformation. As a result, our approach achieves state-of-the-art performance on the OpenScene Predictive World Model benchmark, securing second place, and trains more than 300% faster than the baseline model. Code is available at https://github.com/zhanghm1995/D2-World.
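
The decoupling described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: all function names (`warp_static`, `warp_dynamic`, `forecast_next`) are hypothetical, the dynamic mask and flow are assumed to be given rather than predicted by a network, the flow is assumed to be expressed in voxel units of the target frame, and a simple forward scatter stands in for whatever warping operator the released code uses.

```python
import numpy as np


def _scatter(occ, src_idx, dst_idx):
    """Copy labels from src_idx to dst_idx, dropping targets outside the grid."""
    out = np.zeros_like(occ)
    inside = np.all((dst_idx >= 0) & (dst_idx < np.array(occ.shape)), axis=1)
    out[tuple(dst_idx[inside].T)] = occ[tuple(src_idx[inside].T)]
    return out


def warp_static(occ, static_mask, ego_transform, voxel_size=1.0):
    """Move static voxels into the next ego frame with a rigid 4x4 transform."""
    src = np.argwhere(static_mask & (occ > 0))            # (N, 3) voxel indices
    centers = (src + 0.5) * voxel_size                    # voxel centers in metres
    homo = np.concatenate([centers, np.ones((len(centers), 1))], axis=1)
    moved = (homo @ ego_transform.T)[:, :3]               # apply the pose change
    dst = np.floor(moved / voxel_size).astype(int)        # back to voxel indices
    return _scatter(occ, src, dst)


def warp_dynamic(occ, dynamic_mask, flow):
    """Move dynamic voxels by their per-voxel flow (assumed in voxel units)."""
    src = np.argwhere(dynamic_mask & (occ > 0))
    dst = src + np.rint(flow[tuple(src.T)]).astype(int)
    return _scatter(occ, src, dst)


def forecast_next(occ, dynamic_mask, flow, ego_transform, voxel_size=1.0):
    """Compose the two branches; dynamic voxels overwrite static ones."""
    static_part = warp_static(occ, ~dynamic_mask, ego_transform, voxel_size)
    dynamic_part = warp_dynamic(occ, dynamic_mask, flow)
    return np.where(dynamic_part > 0, dynamic_part, static_part)


# Toy scene: one static voxel (label 1) and one dynamic voxel (label 2).
occ = np.zeros((4, 4, 1), dtype=np.int64)
occ[1, 0, 0] = 1
occ[2, 2, 0] = 2
dynamic_mask = occ == 2
flow = np.zeros((4, 4, 1, 3))
flow[2, 2, 0] = [1.0, 0.0, 0.0]        # dynamic voxel moves one cell along x
ego_transform = np.eye(4)
ego_transform[0, 3] = -1.0             # ego advances one cell, so statics shift back
future = forecast_next(occ, dynamic_mask, flow, ego_transform)
```

Because the static branch needs only the ego pose, the learned part of such a model can concentrate on the flow of the dynamic voxels, which is the simplification the abstract refers to.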

