CV | Feb 27, 2025

High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model

arXiv:2502.19894v1 | 9 citations | h-index: 11 | CVPR
Originality: Highly original
AI Analysis

This paper addresses the challenge of creating realistic, lighting-controllable animated portraits for applications in digital media and entertainment. It presents a novel method rather than an incremental improvement.

The paper tackles the problem of relightable portrait animation by separating intrinsic and extrinsic features, achieving high-fidelity animations with user-specified lighting. Results show it outperforms state-of-the-art methods in lighting realism, image quality, and video consistency.

Relightable portrait animation aims to animate a static reference portrait to match the head movements and expressions of a driving video while adapting to user-specified or reference lighting conditions. Existing portrait animation methods fail to achieve relightable portraits because they do not separate and manipulate intrinsic (identity and appearance) and extrinsic (pose and lighting) features. In this paper, we present a Lighting Controllable Video Diffusion model (LCVD) for high-fidelity, relightable portrait animation. We address this limitation by distinguishing these feature types through dedicated subspaces within the feature space of a pre-trained image-to-video diffusion model. Specifically, we employ the 3D mesh, pose, and lighting-rendered shading hints of the portrait to represent the extrinsic attributes, while the reference represents the intrinsic attributes. In the training phase, we employ a reference adapter to map the reference into the intrinsic feature subspace and a shading adapter to map the shading hints into the extrinsic feature subspace. By merging features from these subspaces, the model achieves nuanced control over lighting, pose, and expression in generated animations. Extensive evaluations show that LCVD outperforms state-of-the-art methods in lighting realism, image quality, and video consistency, setting a new benchmark in relightable portrait animation.
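
The two-adapter design described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the adapters here are single linear maps, the features are merged by element-wise summation, and all names and dimensions (`LinearAdapter`, `merge_features`, `FEAT_DIM`, and so on) are hypothetical, chosen only to show how an intrinsic branch and an extrinsic branch might feed one shared feature space.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM = 8    # hypothetical width of the diffusion model's feature space
REF_DIM = 6     # hypothetical size of an encoded reference portrait
SHADE_DIM = 4   # hypothetical size of an encoded per-frame shading hint


class LinearAdapter:
    """Toy adapter: one linear map into the shared feature space."""

    def __init__(self, in_dim, out_dim, rng):
        self.weight = rng.normal(scale=0.1, size=(in_dim, out_dim))

    def __call__(self, x):
        return x @ self.weight


reference_adapter = LinearAdapter(REF_DIM, FEAT_DIM, rng)  # intrinsic branch
shading_adapter = LinearAdapter(SHADE_DIM, FEAT_DIM, rng)  # extrinsic branch


def merge_features(reference, shading_hints):
    """Combine one reference encoding with per-frame shading hints.

    reference:     shape (REF_DIM,)
    shading_hints: shape (num_frames, SHADE_DIM)
    returns:       shape (num_frames, FEAT_DIM)
    """
    intrinsic = reference_adapter(reference)    # (FEAT_DIM,)
    extrinsic = shading_adapter(shading_hints)  # (num_frames, FEAT_DIM)
    # Assumption: merge by summation, broadcasting identity across frames.
    return intrinsic[None, :] + extrinsic


# Same reference portrait, two different lighting conditions.
reference = rng.normal(size=REF_DIM)
hints_original = rng.normal(size=(3, SHADE_DIM))
hints_relit = rng.normal(size=(3, SHADE_DIM))

cond_original = merge_features(reference, hints_original)
cond_relit = merge_features(reference, hints_relit)
```

In this toy setup, swapping the shading hints changes only the extrinsic contribution, so the difference between `cond_relit` and `cond_original` depends on the hints alone. That mirrors, in a very simplified form, the idea of relighting an animation while keeping the reference identity fixed.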

Code Implementations: 1 repo
