CV · May 29

Learning Global Motion with Compact Gaussians for Feed-Forward 4D Reconstruction

arXiv:2605.31595 · 88.1
Predicted impact: top 18% in CV · last 90 days
Originality: Highly original
AI Analysis

This work is relevant to computer vision researchers working on dynamic scene reconstruction and novel-view synthesis. It offers a feed-forward approach that uses fewer Gaussians and is more robust to large temporal gaps.

The paper addresses dynamic scene reconstruction from monocular video. Existing feed-forward methods predict 3D Gaussians pixel-wise for each frame, which produces duplicated Gaussians and view-dependent biases that hinder learning of scene motion. The authors propose C4G, a feed-forward 4D reconstruction framework that uses a compact set of timestamp-conditioned learnable Gaussian query tokens to model globally coherent motion. It achieves strong novel-view synthesis performance with significantly fewer Gaussians and without requiring camera poses.

Dynamic scene reconstruction from monocular video remains a fundamental challenge in computer vision. Existing feed-forward methods predict 3D Gaussians pixel-wise for each frame, suffering from duplicated Gaussians and view-dependent biases that hinder effective learning of scene motion. We present C4G, a feed-forward 4D reconstruction framework built upon a compact set of timestamp-conditioned learnable Gaussian query tokens. Each token aggregates corresponding features across the full temporal context and decodes a 3D Gaussian whose position is modulated by the target timestamp, enabling globally coherent motion modeling without per-scene optimization. To capture fine-grained details, we further introduce a video diffusion model-based rendering enhancement module. Since our framework effectively aggregates features into Gaussians, we extend this capability to feature lifting, producing a 4D feature field that supports point tracking and dynamic scene understanding. C4G achieves strong novel-view synthesis performance using significantly fewer Gaussians and without requiring camera poses, while exhibiting stronger motion modeling and robustness to large temporal gaps.
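To make the query-token idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes single-head attention without learned key/value projections, random weights in place of trained ones, a sinusoidal timestamp embedding, and an additive position offset as the form of "modulation". Every name in it (`CompactGaussianDecoder`, `aggregate`, `positions`, `time_embedding`) is hypothetical. Only Gaussian centers are decoded; rotation, scale, opacity, and color are omitted.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def time_embedding(t, dim):
    """Sinusoidal embedding of a scalar timestamp t (dim must be even)."""
    emb = []
    for k in range(dim // 2):
        freq = (2 ** k) * math.pi
        emb += [math.sin(freq * t), math.cos(freq * t)]
    return emb


def cross_attend(query, context):
    """One query token attends over all context features (keys double as values)."""
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax(
        [scale * sum(q * c for q, c in zip(query, feat)) for feat in context]
    )
    return [
        sum(w * feat[i] for w, feat in zip(weights, context))
        for i in range(len(query))
    ]


def random_matrix(rows, cols, rng, std):
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


class CompactGaussianDecoder:
    """Hypothetical toy model: K query tokens -> K time-dependent Gaussian centers."""

    def __init__(self, num_tokens, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # Stand-ins for learned parameters (random here, trained in practice).
        self.tokens = random_matrix(num_tokens, dim, rng, 1.0)
        self.W_base = random_matrix(3, dim, rng, 0.1)        # token -> base xyz
        self.W_motion = random_matrix(3, 2 * dim, rng, 0.1)  # [token, time] -> xyz offset

    def aggregate(self, frames):
        # Flatten features from every frame so each token sees the full video.
        context = [feat for frame in frames for feat in frame]
        return [cross_attend(tok, context) for tok in self.tokens]

    def positions(self, aggregated, t):
        # Assumption: the timestamp modulates position through an additive offset.
        emb = time_embedding(t, self.dim)
        centers = []
        for tok in aggregated:
            base = matvec(self.W_base, tok)
            offset = matvec(self.W_motion, tok + emb)  # list concatenation
            centers.append([b + o for b, o in zip(base, offset)])
        return centers


# Toy input: T frames, each with N feature vectors of size D; K query tokens.
T, N, D, K = 4, 16, 8, 5
rng = random.Random(1)
frames = [[[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(N)] for _ in range(T)]

model = CompactGaussianDecoder(K, D)
agg = model.aggregate(frames)        # computed once per video
p0 = model.positions(agg, 0.0)       # Gaussian centers at t = 0
p1 = model.positions(agg, 1.0)       # same K Gaussians, moved to t = 1
print(len(p0), "Gaussians decoded from", T * N, "per-frame features")
```

In this sketch the number of Gaussians is fixed by `K` and does not grow with the number of frames or features, whereas a pixel-wise predictor would emit one Gaussian per feature per frame. Aggregation runs once per video, and any timestamp, including ones between input frames, can then be queried through `positions`.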

