CV · Jan 31, 2024

CARFF: Conditional Auto-encoded Radiance Field for 3D Scene Forecasting

Jiezhi Yang, Khushi Desai, Charles Packer, Harshil Bhatia, Nicholas Rhinehart, Rowan McAllister, Joseph Gonzalez

CMU

arXiv:2401.18075v2 · 8.74 citations · h-index: 20 · ECCV

Originality: Incremental advance

AI Analysis

This work addresses the challenge of 3D scene forecasting for autonomous driving, enabling explainable predictions and planning in uncertain, multi-agent environments.

The paper tackles the problem of predicting future 3D scenes from past 2D ego-centric images, proposing CARFF to map images to latent 3D configurations and predict their evolution, with results demonstrated in autonomous driving scenarios using the CARLA simulator.

We propose CARFF, a method for predicting future 3D scenes given past observations. Our method maps 2D ego-centric images to a distribution over plausible 3D latent scene configurations and predicts the evolution of hypothesized scenes through time. Our latents condition a global Neural Radiance Field (NeRF) to represent a 3D scene model, enabling explainable predictions and straightforward downstream planning. This approach models the world as a POMDP and considers complex scenarios of uncertainty in environmental states and dynamics. Specifically, we employ a two-stage training of Pose-Conditional-VAE and NeRF to learn 3D representations, and auto-regressively predict latent scene representations utilizing a mixture density network. We demonstrate the utility of our method in scenarios using the CARLA driving simulator, where CARFF enables efficient trajectory and contingency planning in complex multi-agent autonomous driving scenarios involving occlusions.
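The auto-regressive latent prediction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_mdn` is a hypothetical stand-in for a trained mixture density network, the two-mode "stay or shift" dynamics are invented for illustration, and all function names are assumptions. The sketch only shows the general pattern of sampling a latent from a Gaussian mixture and feeding it back as the next input.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax, turning raw scores into mixture weights."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_mixture(weights, means, stds, rng):
    """Draw one latent vector from a diagonal-Gaussian mixture."""
    # Pick a mixture component according to its weight, then sample from it.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return [rng.gauss(mu, sd) for mu, sd in zip(means[k], stds[k])]


def toy_mdn(latent):
    """Hypothetical stand-in for a trained mixture density network.

    Returns (weights, means, stds) for two modes: the scene either stays
    as it is or shifts by +1 in every latent dimension.
    """
    weights = softmax([0.0, 0.0])  # two equally likely futures (assumption)
    means = [list(latent), [z + 1.0 for z in latent]]
    stds = [[0.01] * len(latent), [0.01] * len(latent)]
    return weights, means, stds


def rollout(latent, steps, mdn=toy_mdn, seed=0):
    """Auto-regressive rollout: each sampled latent is fed back as input."""
    rng = random.Random(seed)
    trajectory = [list(latent)]
    for _ in range(steps):
        weights, means, stds = mdn(trajectory[-1])
        trajectory.append(sample_mixture(weights, means, stds, rng))
    return trajectory
```

Calling `rollout([0.0, 0.0], steps=3)` returns a list of four latent vectors: the initial one plus three sampled futures. In the setting the abstract describes, each such latent would condition the NeRF to render a hypothesized scene; that decoding step is outside this sketch.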
