CV · Apr 23

Vista4D: Video Reshooting with 4D Point Clouds

arXiv:2604.21915 · 99.02 citations
Predicted impact: top 2% in CV · last 90 days
Originality: Incremental advance
AI Analysis

For researchers and practitioners in video editing and novel view synthesis, this method addresses the challenge of robustly reshooting real-world dynamic videos with precise camera control.

Vista4D introduces a video reshooting framework that uses a 4D point cloud representation to re-synthesize dynamic scenes from new camera trajectories, achieving improved 4D consistency, camera control, and visual quality over state-of-the-art baselines.

We present Vista4D, a robust and flexible video reshooting framework that grounds the input video and target cameras in a 4D point cloud. Specifically, given an input video, our method re-synthesizes the scene with the same dynamics from a different camera trajectory and viewpoint. Existing video reshooting methods often struggle with depth estimation artifacts of real-world dynamic videos, while also failing to preserve content appearance and failing to maintain precise camera control for challenging new trajectories. We build a 4D-grounded point cloud representation with static pixel segmentation and 4D reconstruction to explicitly preserve seen content and provide rich camera signals, and we train with reconstructed multiview dynamic data for robustness against point cloud artifacts during real-world inference. Our results demonstrate improved 4D consistency, camera control, and visual quality compared to state-of-the-art baselines under a variety of videos and camera paths. Moreover, our method generalizes to real-world applications such as dynamic scene expansion and 4D scene recomposition. See our project page for results, code, and models: https://eyeline-labs.github.io/Vista4D
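The abstract describes grounding the input video and the target cameras in a 4D point cloud, with static pixels segmented so that seen content is preserved. As a rough illustration of that idea only, and not the authors' implementation, the sketch below lifts per-frame depth maps to world-space points with a pinhole model, keeps static points shared across time and dynamic points per frame, and splats the result into a new camera with a z-buffer. Every name here (`unproject`, `build_4d_cloud`, `render`, the frame dictionary keys, the `K = (fx, fy, cx, cy)` tuple) is hypothetical, and the depth maps, static masks, and camera poses are assumed to come from upstream estimators.

```python
from math import inf

IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def mat_vec(R, p):
    """Multiply a 3x3 matrix (tuple of rows) by a 3-vector."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) for i in range(3))


def transpose(R):
    """Transpose a 3x3 matrix; for a rotation this is also its inverse."""
    return tuple(tuple(R[j][i] for j in range(3)) for i in range(3))


def unproject(depth, colors, K, R_c2w, t_c2w):
    """Lift every pixel with positive depth to a world-space point.

    Returns a list of (point, color, (u, v)) triples.
    """
    fx, fy, cx, cy = K
    out = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d <= 0:
                continue  # skip invalid depth
            p_cam = ((u - cx) * d / fx, (v - cy) * d / fy, d)
            p_rot = mat_vec(R_c2w, p_cam)
            p_world = tuple(p_rot[i] + t_c2w[i] for i in range(3))
            out.append((p_world, colors[v][u], (u, v)))
    return out


def build_4d_cloud(frames, K):
    """Split points into a time-shared static set and per-frame dynamic sets.

    Each frame is a dict with hypothetical keys: "depth", "colors",
    "static" (2D booleans), "R" and "t" (camera-to-world pose).
    """
    static_pts, dynamic_pts = [], []
    for f in frames:
        dyn = []
        for p, c, (u, v) in unproject(f["depth"], f["colors"], K, f["R"], f["t"]):
            (static_pts if f["static"][v][u] else dyn).append((p, c))
        dynamic_pts.append(dyn)
    return static_pts, dynamic_pts


def render(points, K, R_c2w, t_c2w, width, height):
    """Z-buffered point splat into a target camera; None marks unseen pixels."""
    fx, fy, cx, cy = K
    R_w2c = transpose(R_c2w)
    image = [[None] * width for _ in range(height)]
    zbuf = [[inf] * width for _ in range(height)]
    for p, color in points:
        rel = tuple(p[i] - t_c2w[i] for i in range(3))
        x, y, z = mat_vec(R_w2c, rel)
        if z <= 0:
            continue  # behind the target camera
        u = round(fx * x / z + cx)
        v = round(fy * y / z + cy)
        if 0 <= u < width and 0 <= v < height and z < zbuf[v][u]:
            zbuf[v][u] = z
            image[v][u] = color
    return image
```

To render frame `t` from a new pose, one would pass `static_pts + dynamic_pts[t]` to `render`. Pixels left as `None` are regions the source video never observed; in a reshooting pipeline these are the areas a generative model would have to synthesize, and noisy depth would show up here as misplaced or missing points.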

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
