CV · Feb 10

4RC: 4D Reconstruction via Conditional Querying Anytime and Anywhere

arXiv:2602.10094v1 · 3 citations · h-index: 13
Originality: Highly original
AI Analysis

This paper addresses holistic 4D reconstruction for computer vision applications. It is an incremental improvement over existing approaches, which either decouple motion from geometry or produce only limited 4D attributes.

The paper tackles 4D reconstruction from monocular videos by introducing 4RC, a unified feed-forward framework that jointly captures dense scene geometry and motion dynamics, outperforming prior methods across various tasks.

We present 4RC, a unified feed-forward framework for 4D reconstruction from monocular videos. Unlike existing approaches that typically decouple motion from geometry or produce limited 4D attributes such as sparse trajectories or two-view scene flow, 4RC learns a holistic 4D representation that jointly captures dense scene geometry and motion dynamics. At its core, 4RC introduces a novel encode-once, query-anywhere and anytime paradigm: a transformer backbone encodes the entire video into a compact spatio-temporal latent space, from which a conditional decoder can efficiently query 3D geometry and motion for any query frame at any target timestamp. To facilitate learning, we represent per-view 4D attributes in a minimally factorized form by decomposing them into base geometry and time-dependent relative motion. Extensive experiments demonstrate that 4RC outperforms prior and concurrent methods across a wide range of 4D reconstruction tasks.
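The encode-once, query-anywhere-and-anytime idea and the base-geometry-plus-relative-motion factorization can be illustrated with a toy sketch. This is not the paper's architecture: `SceneLatent`, `encode_video`, and `query` are hypothetical names, the "latent" is a plain data container rather than a transformer encoding, and the constant-velocity motion model is an assumption chosen only to keep the example runnable. The point it shows is the interface: the video is processed once, and any (query frame, target timestamp) pair is then answered as base geometry plus a motion term that vanishes at the frame's own timestamp.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class SceneLatent:
    """Hypothetical stand-in for the compact spatio-temporal latent."""
    base_geometry: List[List[Point]]  # per-frame 3D points at each frame's own time
    velocity: List[List[Point]]       # per-point velocity (toy motion model, an assumption)
    timestamps: List[float]           # capture time of each frame


def encode_video(base_geometry: List[List[Point]],
                 velocity: List[List[Point]],
                 timestamps: List[float]) -> SceneLatent:
    """'Encode once': a real model would run a learned backbone here."""
    return SceneLatent(base_geometry, velocity, timestamps)


def query(latent: SceneLatent, frame_idx: int, target_time: float) -> List[Point]:
    """Return the 3D points seen in frame `frame_idx`, moved to `target_time`.

    Factorization: position(t) = base geometry + relative motion(t),
    where relative motion is zero at the frame's own timestamp.
    """
    dt = target_time - latent.timestamps[frame_idx]
    result = []
    for point, vel in zip(latent.base_geometry[frame_idx], latent.velocity[frame_idx]):
        motion = tuple(dt * c for c in vel)  # time-dependent relative motion
        result.append(tuple(p + m for p, m in zip(point, motion)))
    return result


# One frame captured at t = 0 with a static point and a point moving along x.
latent = encode_video(
    base_geometry=[[(0.0, 0.0, 1.0), (1.0, 1.0, 2.0)]],
    velocity=[[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]],
    timestamps=[0.0],
)
at_capture = query(latent, frame_idx=0, target_time=0.0)
later = query(latent, frame_idx=0, target_time=2.0)
```

Querying at the frame's own timestamp returns the base geometry unchanged, while querying at a later time displaces only the moving point. The encoding step is never repeated between queries, which is the property the abstract describes.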

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
