CV AI Dec 6, 2024

MotionFlow: Attention-Driven Motion Transfer in Video Diffusion Models

Tuna Han Salih Meral, Hidir Yesiltepe, Connor Dunlop, Pinar Yanardag

arXiv:2412.05275v113.514 citations h-index: 11

Originality: Incremental advance

AI Analysis

The paper addresses limited motion control in video diffusion models, a known bottleneck for practical applications, and proposes a novel method for it.

The paper tackles fine-grained motion control in text-to-video models by introducing MotionFlow, a training-free framework that uses cross-attention maps for motion transfer. The authors report that it significantly outperforms existing models in both fidelity and versatility.

Text-to-video models have demonstrated impressive capabilities in producing diverse and captivating video content, showcasing a notable advancement in generative AI. However, these models generally lack fine-grained control over motion patterns, limiting their practical applicability. We introduce MotionFlow, a novel framework designed for motion transfer in video diffusion models. Our method utilizes cross-attention maps to accurately capture and manipulate spatial and temporal dynamics, enabling seamless motion transfers across various contexts. Our approach does not require training and operates at test time by leveraging the inherent capabilities of pre-trained video diffusion models. In contrast to traditional approaches, which struggle with comprehensive scene changes while maintaining consistent motion, MotionFlow successfully handles such complex transformations through its attention-based mechanism. Our qualitative and quantitative experiments demonstrate that MotionFlow significantly outperforms existing models in both fidelity and versatility even during drastic scene alterations.
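For readers unfamiliar with the term, the sketch below shows what a cross-attention map is in the generic scaled dot-product sense: for each frame and spatial position, a softmax distribution over text tokens. It is an illustration only, not the authors' implementation. The function names, the toy shapes, and the assumption that queries and keys are already projected are all hypothetical, and the abstract does not specify how MotionFlow extracts or manipulates its maps.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_maps(queries, keys):
    """Generic scaled dot-product cross-attention weights.

    queries[f][p] is the (already projected) query vector for spatial
    position p in frame f; keys[t] is the key vector for text token t.
    Returns maps[f][p][t], a distribution over tokens at each position.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    maps = []
    for frame in queries:
        frame_map = []
        for q in frame:
            logits = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
            frame_map.append(softmax(logits))
        maps.append(frame_map)
    return maps


# Toy inputs (hypothetical sizes): 3 frames, 4 spatial positions, 5 tokens.
random.seed(0)
num_frames, num_positions, num_tokens, dim = 3, 4, 5, 8
queries = [
    [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_positions)]
    for _ in range(num_frames)
]
keys = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_tokens)]

maps = cross_attention_maps(queries, keys)

# Attention paid to one token (index 2) at every position, frame by frame.
# Following how this slice changes across frames gives a rough picture of
# where the corresponding concept is located over time.
token_track = [
    [maps[f][p][2] for p in range(num_positions)] for f in range(num_frames)
]
```

Slicing the maps by token, as in `token_track`, gives one spatial map per frame for a chosen word, which is the kind of spatial and temporal signal the abstract refers to.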
