CV · Jun 8, 2024

MotionClone: Training-Free Motion Cloning for Controllable Video Generation

arXiv:2406.05338v6 · 111 citations
Originality: Incremental advance
AI Analysis

This work addresses content creators' need for more flexible and efficient motion control in video generation. The advance is incremental, since it builds on existing temporal-attention methods.

The paper tackles the limited flexibility and generalization of motion-based controllable video generation. It proposes MotionClone, a training-free framework that clones motion from reference videos for text-to-video and image-to-video generation, and reports notable superiority in motion fidelity, textual alignment, and temporal consistency.

Motion-based controllable video generation offers the potential for creating captivating visual content. Existing methods typically require model training to encode particular motion cues or fine-tuning to inject certain motion patterns, resulting in limited flexibility and generalization. In this work, we propose MotionClone, a training-free framework that enables motion cloning from reference videos for versatile motion-controlled video generation, including text-to-video and image-to-video. Based on the observation that the dominant components in temporal-attention maps drive motion synthesis, while the rest mainly capture noisy or very subtle motions, MotionClone utilizes sparse temporal attention weights as motion representations for motion guidance, facilitating diverse motion transfer across varying scenarios. Meanwhile, MotionClone allows for the direct extraction of the motion representation through a single denoising step, bypassing cumbersome inversion processes and thus promoting both efficiency and flexibility. Extensive experiments demonstrate that MotionClone is proficient in both global camera motion and local object motion, with notable superiority in terms of motion fidelity, textual alignment, and temporal consistency.
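
The idea of using sparse temporal attention weights as a motion representation can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`topk_mask`, `motion_guidance_loss`), the top-k sparsification rule, and the squared-error form of the guidance term are all assumptions chosen for illustration. The sketch keeps only the dominant entries of a reference temporal-attention map and penalizes the generated map's deviation at those positions.

```python
import numpy as np

def topk_mask(attn, k=1):
    """Binary mask that keeps the k largest entries along the last axis.

    Assumption: "dominant components" are modelled as a per-row top-k.
    """
    # argpartition moves the indices of the k largest entries into the last k slots
    idx = np.argpartition(attn, -k, axis=-1)[..., -k:]
    mask = np.zeros_like(attn, dtype=float)
    np.put_along_axis(mask, idx, 1.0, axis=-1)
    return mask

def motion_guidance_loss(attn_ref, attn_gen, k=1):
    """Squared error between reference and generated temporal attention,
    restricted to the reference map's dominant (masked) positions.

    Hypothetical guidance term; the exact form in the paper may differ.
    """
    mask = topk_mask(attn_ref, k)
    diff = mask * (attn_ref - attn_gen)
    return float(np.sum(diff ** 2))

# Toy temporal-attention rows over 3 frames (each row sums to 1).
attn_ref = np.array([[0.7, 0.2, 0.1],
                     [0.1, 0.6, 0.3]])
attn_gen = np.array([[0.5, 0.3, 0.2],
                     [0.1, 0.6, 0.3]])

mask = topk_mask(attn_ref, k=1)
loss = motion_guidance_loss(attn_ref, attn_gen, k=1)
print(mask)
print(loss)
```

In a guided sampler, the gradient of such a term with respect to the latent would steer generation toward the reference motion; only the masked positions contribute, so weak or noisy attention entries are ignored.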

Code Implementations: 2 repos
