CV · Apr 11, 2025

RealCam-Vid: High-resolution Video Dataset with Dynamic Scenes and Metric-scale Camera Movements

arXiv:2504.08212v1 · 6 citations · h-index: 5 · Has Code
Originality: Incremental advance
AI Analysis

This dataset addresses a bottleneck for video-generation researchers by providing metric-scale geometric consistency in dynamic scenes. The advance is incremental because it builds on prior static-scene datasets.

The authors address the limitations of existing datasets for camera-controllable video generation by introducing RealCam-Vid, which they describe as the first fully open-source, high-resolution dynamic-scene dataset with metric-scale camera annotations. The dataset is intended to support more realistic object motions and more precise camera trajectories.

Recent advances in camera-controllable video generation have been constrained by the reliance on static-scene datasets with relative-scale camera annotations, such as RealEstate10K. While these datasets enable basic viewpoint control, they fail to capture dynamic scene interactions and lack metric-scale geometric consistency, which is critical for synthesizing realistic object motions and precise camera trajectories in complex environments. To bridge this gap, we introduce the first fully open-source, high-resolution dynamic-scene dataset with metric-scale camera annotations, available at https://github.com/ZGCTroy/RealCam-Vid.
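
To make the distinction between relative-scale and metric-scale annotations concrete, here is a minimal Python sketch. It is not taken from the paper or its repository. It demonstrates the general scale ambiguity of a pinhole camera: multiplying both the scene and the camera translation by the same factor leaves every projected pixel unchanged, so images alone cannot pin down real-world units. The function names, intrinsics, points, and the identity-rotation setup are all illustrative assumptions.

```python
def project(point, translation, focal=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of a world point for a camera with identity rotation.

    `translation` is the world-to-camera translation. The intrinsics
    (focal, cx, cy) are made-up values chosen only for illustration.
    """
    x, y, z = (p + t for p, t in zip(point, translation))
    return (focal * x / z + cx, focal * y / z + cy)


def scale(vec, s):
    """Multiply every component of a 3-vector by the scalar s."""
    return tuple(s * v for v in vec)


# Hypothetical scene points and camera translation in arbitrary units.
points = [(0.0, 0.0, 2.0), (0.5, -0.2, 3.0), (-0.3, 0.4, 4.0)]
translation = (0.1, 0.0, 0.5)
s = 3.7  # an arbitrary, unknown global scale factor

original = [project(p, translation) for p in points]
rescaled = [project(scale(p, s), scale(translation, s)) for p in points]

# Largest per-coordinate pixel difference between the two configurations.
max_pixel_diff = max(
    abs(a - b) for o, r in zip(original, rescaled) for a, b in zip(o, r)
)
print(max_pixel_diff)  # zero up to floating-point rounding
```

Because the two configurations are indistinguishable in image space, camera trajectories recovered at relative scale are only comparable within a single clip. Metric-scale annotations fix the factor `s`, so a given translation corresponds to the same physical distance across videos.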

Code Implementations: 1 repo
