CV
Mar 24, 2025

HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation

Tsinghua
arXiv:2503.18860v2
53 citations
h-index: 13
CVPR
Originality: Incremental advance
AI Analysis

This work addresses the challenge of creating realistic and controllable portrait animations for applications in entertainment and media, representing an incremental improvement over prior diffusion-based techniques.

The paper tackles the problem of generating lifelike portrait animations from a single reference image and driving videos by introducing HunyuanPortrait, a diffusion-based method that uses implicit representations for control, resulting in superior temporal consistency and controllability compared to existing methods.

We introduce HunyuanPortrait, a diffusion-based condition control method that employs implicit representations for highly controllable and lifelike portrait animation. Given a single portrait image as an appearance reference and video clips as driving templates, HunyuanPortrait animates the character in the reference image using the facial expressions and head poses of the driving videos. In our framework, we use pre-trained encoders to decouple portrait motion information from identity in videos. To do so, an implicit representation is adopted to encode motion information and is employed as the control signal in the animation phase. Using Stable Video Diffusion as the main building block, we carefully design adapter layers that inject control signals into the denoising UNet through attention mechanisms. These layers bring spatially rich detail and temporal consistency. HunyuanPortrait also exhibits strong generalization performance and can effectively disentangle appearance and motion under different image styles. Our framework outperforms existing methods, demonstrating superior temporal consistency and controllability. Our project is available at https://kkakkkka.github.io/HunyuanPortrait.
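
The abstract says that adapter layers inject implicit motion signals into the denoising UNet through attention, but it does not give the layer design. The following is a minimal sketch of one common way to do this, a residual cross-attention adapter in which latent tokens query motion tokens. It is not the authors' implementation: the function names, dimensions, random weights, and the `gate` parameter are all hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def attention_weights(latents, motion, w_q, w_k):
    """Scaled dot-product attention of latent tokens over motion tokens."""
    q = matmul(latents, w_q)   # queries come from the UNet latent tokens
    k = matmul(motion, w_k)    # keys come from the implicit motion tokens
    scale = math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    return [softmax([s / scale for s in row]) for row in scores]


def motion_adapter(latents, motion, w_q, w_k, w_v, gate=1.0):
    """Hypothetical adapter: add attended motion features to the latents.

    `gate` scales the injected signal; gate=0.0 leaves the latents unchanged.
    """
    weights = attention_weights(latents, motion, w_q, w_k)
    values = matmul(motion, w_v)       # values projected to the latent width
    update = matmul(weights, values)
    return [[h + gate * u for h, u in zip(h_row, u_row)]
            for h_row, u_row in zip(latents, update)]


# Toy sizes (assumed): 5 latent tokens of width 8, 3 motion tokens of width 4.
rng = random.Random(0)
dim_latent, dim_motion, dim_attn = 8, 4, 6
latents = random_matrix(5, dim_latent, rng, scale=1.0)
motion = random_matrix(3, dim_motion, rng, scale=1.0)
w_q = random_matrix(dim_latent, dim_attn, rng)
w_k = random_matrix(dim_motion, dim_attn, rng)
w_v = random_matrix(dim_motion, dim_latent, rng)

conditioned = motion_adapter(latents, motion, w_q, w_k, w_v)
```

The residual form means the adapter only adds to the backbone's features, so a pre-trained diffusion model behaves as before when the gate is zero. Whether HunyuanPortrait uses a gate or this exact residual structure is not stated in the abstract.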
