CV · Jun 16, 2025

Vid-CamEdit: Video Camera Trajectory Editing with Generative Rendering from Estimated Geometry

Junyoung Seo, Jisang Han, Jaewoo Jung, Siyoon Jin, Joungbin Lee, Takuya Narihira, Kazumi Fukuda, Takashi Shibuya, Donghoon Ahn, Shoukang Hu, Seungryong Kim, Yuki Mitsufuji

arXiv:2506.13697v1 · 11.84 citations · h-index: 31

Originality: Incremental advance

AI Analysis

This work addresses video camera trajectory editing for in-the-wild videos. The advance is incremental: it builds on existing methods by integrating geometry estimation with generative rendering.

The authors tackle the problem of editing camera trajectories in monocular videos so that new views can be synthesized along user-defined paths. They report more plausible videos than baselines, particularly for extreme extrapolation on real-world footage.

We introduce Vid-CamEdit, a novel framework for video camera trajectory editing, enabling the re-synthesis of monocular videos along user-defined camera paths. This task is challenging due to its ill-posed nature and the limited multi-view video data for training. Traditional reconstruction methods struggle with extreme trajectory changes, and existing generative models for dynamic novel view synthesis cannot handle in-the-wild videos. Our approach consists of two steps: estimating temporally consistent geometry, and generative rendering guided by this geometry. By integrating geometric priors, the generative model focuses on synthesizing realistic details where the estimated geometry is uncertain. We eliminate the need for extensive 4D training data through a factorized fine-tuning framework that separately trains spatial and temporal components using multi-view image and video data. Our method outperforms baselines in producing plausible videos from novel camera trajectories, especially in extreme extrapolation scenarios on real-world footage.
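To illustrate why rendering from estimated geometry alone leaves regions for a generative model to fill, here is a minimal sketch of forward-warping one frame's depth map into a new camera. It is not the authors' implementation: the pinhole camera model, the function name `warp_coverage`, and all parameters are assumptions made for this example. Pixels left uncovered after warping are the disoccluded areas where geometry gives no information.

```python
def warp_coverage(depth, fx, fy, cx, cy, R, t):
    """Forward-warp source pixels into a target camera and report coverage.

    Hypothetical helper for illustration (not from the paper).
    depth: H x W nested list of per-pixel depths in the source view
    fx, fy, cx, cy: pinhole intrinsics, assumed shared by both views
    R, t: 3x3 rotation and 3-vector translation, source -> target frame
    Returns an H x W nested list of booleans: True where at least one
    source pixel lands on that target pixel.
    """
    H, W = len(depth), len(depth[0])
    covered = [[False] * W for _ in range(H)]
    for v in range(H):
        for u in range(W):
            d = depth[v][u]
            # Unproject pixel (u, v) to a 3D point in the source camera frame.
            p = ((u - cx) / fx * d, (v - cy) / fy * d, d)
            # Move the point into the target camera frame: q = R p + t.
            q = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
            if q[2] <= 1e-6:
                continue  # point is behind the target camera
            # Project into target pixel coordinates (nearest pixel).
            ut = round(fx * q[0] / q[2] + cx)
            vt = round(fy * q[1] / q[2] + cy)
            if 0 <= ut < W and 0 <= vt < H:
                covered[vt][ut] = True
    return covered


# Usage: a flat scene at depth 1, viewed after a sideways camera shift.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
flat_depth = [[1.0] * 8 for _ in range(4)]
mask = warp_coverage(flat_depth, 4.0, 4.0, 4.0, 2.0, identity, (1.0, 0.0, 0.0))
holes = sum(row.count(False) for row in mask)
```

In this toy setup the shift exposes a band of target pixels that no source pixel reaches. A real pipeline would also need occlusion handling (for example a z-buffer) and sub-pixel splatting, both omitted here.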
