CV · Apr 4

SymphoMotion: Joint Control of Camera Motion and Object Dynamics for Coherent Video Generation

arXiv:2604.03723 · 86.1 · h-index: 2
AI Analysis

This work provides a novel solution for coherent video generation with joint control of camera and object motion, addressing a key limitation in current video generation models.

SymphoMotion introduces a unified framework for jointly controlling camera motion and object dynamics in video generation, outperforming existing methods in visual fidelity, camera controllability, and object-motion accuracy. The method includes a new dataset, RealCOD-25K, to address data gaps.

Controlling both camera motion and object dynamics is essential for coherent and expressive video generation, yet current methods typically handle only one motion type or rely on ambiguous 2D cues that entangle camera-induced parallax with true object movement. We present SymphoMotion, a unified motion-control framework that jointly governs camera trajectories and object dynamics within a single model. SymphoMotion features a Camera Trajectory Control mechanism that integrates explicit camera paths with geometry-aware cues to ensure stable, structurally consistent viewpoint transitions, and an Object Dynamics Control mechanism that combines 2D visual guidance with 3D trajectory embeddings to enable depth-aware, spatially coherent object manipulation. To support large-scale training and evaluation, we further construct RealCOD-25K, a comprehensive real-world dataset containing paired camera poses and object-level 3D trajectories across diverse indoor and outdoor scenes, addressing a key data gap in unified motion control. Extensive experiments and user studies show that SymphoMotion significantly outperforms existing methods in visual fidelity, camera controllability, and object-motion accuracy, establishing a new benchmark for unified motion control in video generation. Code and data are publicly available at https://grenoble-zhang.github.io/SymphoMotion/.
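
The ambiguity of 2D cues mentioned in the abstract can be illustrated with a toy pinhole camera. The sketch below is not taken from the SymphoMotion code; the function names (`project`, `flow_2d`), the rotation-free camera, and the unit focal length are all assumptions made for illustration. It shows that a camera translation and an opposite object translation can produce the same 2D displacement, and that the displacement caused by camera motion depends on depth.

```python
# Toy illustration (hypothetical helper names, not the paper's implementation):
# 2D image motion alone cannot tell camera motion from object motion.

def project(point, cam_pos, focal=1.0):
    """Pinhole projection of a 3D world point.

    Assumes a camera at cam_pos looking along +z with no rotation.
    """
    x, y, z = (p - c for p, c in zip(point, cam_pos))
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (focal * x / z, focal * y / z)


def flow_2d(p_start, p_end, cam_start, cam_end, focal=1.0):
    """2D displacement of a point between two frames."""
    u0, v0 = project(p_start, cam_start, focal)
    u1, v1 = project(p_end, cam_end, focal)
    return (u1 - u0, v1 - v0)


# Case A: static object at depth 4, camera moves 0.5 to the right.
flow_camera = flow_2d((0, 0, 4), (0, 0, 4), (0, 0, 0), (0.5, 0, 0))

# Case B: object at depth 4 moves 0.5 to the left, camera is static.
flow_object = flow_2d((0, 0, 4), (-0.5, 0, 4), (0, 0, 0), (0, 0, 0))

# Case C: same camera move as A, but the static object is at depth 2.
flow_near = flow_2d((0, 0, 2), (0, 0, 2), (0, 0, 0), (0.5, 0, 0))

print(flow_camera, flow_object, flow_near)
```

Cases A and B yield identical 2D displacements, which is why a purely 2D motion signal is ambiguous. Case C differs from A only in depth, which suggests why depth-aware cues such as camera poses and 3D trajectories can help separate the two sources of motion.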

