CV, AI, LG | Mar 22, 2024

Spectral Motion Alignment for Video Motion Transfer using Diffusion Models

arXiv:2403.15249v2 | 21 citations | h-index: 12 | AAAI
AI Analysis

This work addresses motion transfer challenges in video customization for users of text-to-video diffusion models, representing an incremental improvement over existing methods.

The paper tackles inaccurate motion distillation in video motion transfer with diffusion models. It proposes Spectral Motion Alignment (SMA), a framework that improves motion transfer while remaining computationally efficient and compatible with multiple video customization frameworks.

The evolution of diffusion models has greatly impacted video generation and understanding. Particularly, text-to-video diffusion models (VDMs) have significantly facilitated the customization of input video with target appearance, motion, etc. Despite these advances, challenges persist in accurately distilling motion information from video frames. While existing works leverage the consecutive frame residual as the target motion vector, they inherently lack global motion context and are vulnerable to frame-wise distortions. To address this, we present Spectral Motion Alignment (SMA), a novel framework that refines and aligns motion vectors using Fourier and wavelet transforms. SMA learns motion patterns by incorporating frequency-domain regularization, facilitating the learning of whole-frame global motion dynamics, and mitigating spatial artifacts. Extensive experiments demonstrate SMA's efficacy in improving motion transfer while maintaining computational efficiency and compatibility across various video customization frameworks.
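The abstract describes two ingredients: consecutive frame residuals used as motion vectors, and a frequency-domain regularizer built on Fourier and wavelet transforms. The following is a minimal NumPy sketch of that idea, not the authors' implementation. The function names, the loss weights, the choice of a one-level Haar wavelet along the time axis, and the use of a 2D FFT amplitude over the spatial axes are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def frame_residuals(video):
    """Consecutive-frame differences: (T, H, W) -> (T-1, H, W)."""
    return video[1:] - video[:-1]


def haar_temporal(x):
    """One level of a Haar wavelet transform along the time axis (axis 0).

    Assumes at least two time steps; a trailing odd step is dropped.
    """
    if x.shape[0] % 2:
        x = x[:-1]
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass: slow, global motion
    detail = (even - odd) / np.sqrt(2.0)  # high-pass: fast motion changes
    return approx, detail


def spectral_alignment_loss(pred, target, w_fourier=1.0, w_wavelet=1.0):
    """Hypothetical loss: pixel-domain MSE plus two frequency-domain terms.

    pred, target: motion vectors (frame residuals) of shape (T-1, H, W).
    w_fourier, w_wavelet: illustrative weights, not values from the paper.
    """
    pixel = np.mean((pred - target) ** 2)

    # Spatial term: compare 2D FFT amplitudes of each residual frame.
    amp_pred = np.abs(np.fft.fft2(pred, axes=(-2, -1)))
    amp_target = np.abs(np.fft.fft2(target, axes=(-2, -1)))
    fourier = np.mean(np.abs(amp_pred - amp_target))

    # Temporal term: compare Haar coefficients along the frame axis.
    pa, pd = haar_temporal(pred)
    ta, td = haar_temporal(target)
    wavelet = np.mean(np.abs(pa - ta)) + np.mean(np.abs(pd - td))

    return pixel + w_fourier * fourier + w_wavelet * wavelet


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.random((5, 8, 8))          # toy clip: 5 frames of 8x8
    target = frame_residuals(video)        # 4 residual frames
    noisy = target + 0.1 * rng.standard_normal(target.shape)
    print(spectral_alignment_loss(target, target))
    print(spectral_alignment_loss(noisy, target))
```

In this sketch the temporal wavelet term stands in for the "global motion context" that a single frame residual lacks, and the spatial Fourier term stands in for the mitigation of frame-wise artifacts. In an actual diffusion training loop the same terms would be computed on differentiable tensors rather than NumPy arrays.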
