CV · Nov 6, 2023

Consistent4D: Consistent 360° Dynamic Object Generation from Monocular Video

arXiv:2311.02848v1 · 129 citations · h-index: 5
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of creating 360-degree dynamic 3D models from single-view videos without multi-view data or camera calibration. The advance is incremental, but it opens new possibilities in 4D generation.

The paper tackles the problem of generating 4D dynamic objects from uncalibrated monocular videos, achieving performance competitive with prior methods and demonstrating advantages for text-to-3D generation tasks.

In this paper, we present Consistent4D, a novel approach for generating 4D dynamic objects from uncalibrated monocular videos. Uniquely, we cast the 360-degree dynamic object reconstruction as a 4D generation problem, eliminating the need for tedious multi-view data collection and camera calibration. This is achieved by leveraging the object-level 3D-aware image diffusion model as the primary supervision signal for training Dynamic Neural Radiance Fields (DyNeRF). Specifically, we propose a Cascade DyNeRF to facilitate stable convergence and temporal continuity under the supervision signal which is discrete along the time axis. To achieve spatial and temporal consistency, we further introduce an Interpolation-driven Consistency Loss. It is optimized by minimizing the discrepancy between rendered frames from DyNeRF and interpolated frames from a pre-trained video interpolation model. Extensive experiments show that our Consistent4D can perform competitively to prior art alternatives, opening up new possibilities for 4D dynamic object generation from monocular videos, whilst also demonstrating advantage for conventional text-to-3D generation tasks. Our project page is https://consistent4d.github.io/.
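The Interpolation-driven Consistency Loss described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the mean squared error is an assumed choice of discrepancy measure, and a simple linear blend stands in for the pre-trained video interpolation model, which the abstract does not specify. The sketch only shows the general shape of the idea, namely comparing rendered intermediate frames against frames interpolated from the first and last rendered frames.

```python
import numpy as np


def linear_blend_interpolator(first, last, t):
    """Hypothetical stand-in for a pre-trained video interpolation model.

    A real model would synthesize the in-between frame; here we simply
    blend the two endpoint frames linearly (an assumption for illustration).
    """
    return (1.0 - t) * first + t * last


def interpolation_consistency_loss(rendered, interpolate=linear_blend_interpolator):
    """Mean squared discrepancy between rendered intermediate frames and
    frames interpolated from the first and last rendered frames.

    `rendered` has shape (num_frames, height, width, channels) and would
    come from the dynamic radiance field in the setting of the abstract.
    """
    rendered = np.asarray(rendered, dtype=np.float64)
    num_frames = rendered.shape[0]
    if num_frames < 3:
        raise ValueError("need at least three frames (two endpoints, one middle)")

    first, last = rendered[0], rendered[-1]
    total = 0.0
    for j in range(1, num_frames - 1):
        t = j / (num_frames - 1)  # normalized position between the endpoints
        target = interpolate(first, last, t)
        total += np.mean((rendered[j] - target) ** 2)
    return total / (num_frames - 2)


# Three constant-valued 4x4 RGB frames that already lie on the blend: loss is zero.
consistent = np.stack([np.full((4, 4, 3), v) for v in (0.0, 0.5, 1.0)])
# Same endpoints, but the middle frame deviates from the interpolated one.
inconsistent = np.stack([np.full((4, 4, 3), v) for v in (0.0, 0.7, 1.0)])

loss_consistent = interpolation_consistency_loss(consistent)
loss_inconsistent = interpolation_consistency_loss(inconsistent)
```

In this toy setup the loss is zero when the middle frame matches the interpolated frame and grows as it deviates. According to the abstract, the same kind of comparison is used to encourage both spatial and temporal consistency, so the frames being compared could be adjacent in time or adjacent in viewpoint.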
