CV · Jan 17, 2025

X-Dyna: Expressive Dynamic Human Image Animation

Stanford
arXiv:2501.10021v2 · 19 citations · h-index: 11 · Has Code · CVPR
Originality: Highly original
AI Analysis

This work addresses the challenge of creating lifelike human video animations for applications in entertainment, virtual reality, and content creation, and represents a strong incremental improvement over prior pose-based methods.

The paper animates a single human image using facial expressions and body movements taken from a driving video. The authors report that the method outperforms state-of-the-art approaches in generating realistic and expressive animations.

We introduce X-Dyna, a novel zero-shot, diffusion-based pipeline for animating a single human image using facial expressions and body movements derived from a driving video, that generates realistic, context-aware dynamics for both the subject and the surrounding environment. Building on prior approaches centered on human pose control, X-Dyna addresses key shortcomings causing the loss of dynamic details, enhancing the lifelike qualities of human video animations. At the core of our approach is the Dynamics-Adapter, a lightweight module that effectively integrates reference appearance context into the spatial attentions of the diffusion backbone while preserving the capacity of motion modules in synthesizing fluid and intricate dynamic details. Beyond body pose control, we connect a local control module with our model to capture identity-disentangled facial expressions, facilitating accurate expression transfer for enhanced realism in animated scenes. Together, these components form a unified framework capable of learning physical human motion and natural scene dynamics from a diverse blend of human and scene videos. Comprehensive qualitative and quantitative evaluations demonstrate that X-Dyna outperforms state-of-the-art methods, creating highly lifelike and expressive animations. The code is available at https://github.com/bytedance/X-Dyna.
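
The Dynamics-Adapter idea in the abstract (injecting reference appearance into the backbone's spatial attention without disturbing the motion modules) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a residual formulation: a frozen self-attention branch over frame tokens, plus a parallel branch in which a trainable query projection attends to reference-image tokens, whose result is added back through an output projection. All names (`adapter_attention`, `w_q_adapter`, `w_out_adapter`) are hypothetical, and the zero-initialized output projection is an assumed design choice borrowed from common adapter practice.

```python
import math

def matmul(a, b):
    # Plain-list matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    # Scaled dot-product attention over token lists.
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)

def adapter_attention(frame_tokens, ref_tokens, w_q_adapter, w_out_adapter):
    # Backbone branch: self-attention over the frame's own tokens
    # (projections omitted for brevity; assumed frozen).
    base = attention(frame_tokens, frame_tokens, frame_tokens)
    # Adapter branch (assumption): a separate query projection attends
    # to reference-image tokens to pull in appearance context.
    q_ref = matmul(frame_tokens, w_q_adapter)
    ref_context = attention(q_ref, ref_tokens, ref_tokens)
    residual = matmul(ref_context, w_out_adapter)
    # Residual sum: the backbone output is left intact when the
    # output projection is all zeros.
    return [[b + r for b, r in zip(b_row, r_row)]
            for b_row, r_row in zip(base, residual)]

frame = [[1.0, 0.0], [0.0, 1.0]]             # two frame tokens
ref = [[2.0, 2.0], [0.0, 1.0], [1.0, 0.0]]   # three reference tokens
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

untrained = adapter_attention(frame, ref, identity, zeros)
trained = adapter_attention(frame, ref, identity, identity)
```

With the output projection at zero, `untrained` equals the plain self-attention result, so under this assumption the adapter starts as a no-op and the backbone's behavior is preserved until training moves the projection away from zero.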

Code Implementations: 1 repo
