CV · Dec 2, 2024

InfinityDrive: Breaking Time Limits in Driving World Models

arXiv:2412.01522v2 · 17 citations · h-index: 4
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of generating diverse and extensive driving data for autonomous driving systems, though the contribution appears incremental because it builds on existing world model approaches.

The paper tackles the problem of limited scenario diversity and short time windows in driving world models by introducing InfinityDrive, which achieves state-of-the-art performance in generating high-resolution, consistent videos lasting over 1500 frames (more than 2 minutes).

Autonomous driving systems struggle with complex scenarios due to limited access to diverse, extensive, and out-of-distribution driving data, which is critical for safe navigation. World models offer a promising solution to this challenge; however, current driving world models are constrained by short time windows and limited scenario diversity. To bridge this gap, we introduce InfinityDrive, the first driving world model with exceptional generalization capabilities, delivering state-of-the-art performance in high fidelity, consistency, and diversity with minute-scale video generation. InfinityDrive introduces an efficient spatio-temporal co-modeling module paired with an extended temporal training strategy, enabling high-resolution (576×1024) video generation with consistent spatial and temporal coherence. By incorporating memory injection and retention mechanisms alongside an adaptive memory curve loss to minimize cumulative errors, InfinityDrive achieves consistent video generation lasting over 1500 frames (more than 2 minutes). Comprehensive experiments on multiple datasets validate InfinityDrive's ability to generate complex and varied scenarios, highlighting its potential as a next-generation driving world model built for the evolving demands of autonomous driving. Our project homepage: https://metadrivescape.github.io/papers_project/InfinityDrive/page.html

