CV, Aug 13, 2025

LIA-X: Interpretable Latent Portrait Animator

arXiv:2508.09959v1, 4 citations, h-index: 30
Originality: Incremental advance
AI Analysis

This paper addresses precise, interpretable portrait animation for applications such as video editing and 3D manipulation. It appears incremental: it builds on existing motion transfer methods and adds a novel control strategy.

The paper tackles the transfer of facial dynamics from a driving video to a source portrait with fine-grained control. The authors report that LIA-X outperforms previous approaches in self-reenactment and cross-reenactment tasks across several benchmarks.

We introduce LIA-X, a novel interpretable portrait animator designed to transfer facial dynamics from a driving video to a source portrait with fine-grained control. LIA-X is an autoencoder that models motion transfer as a linear navigation of motion codes in latent space. Crucially, it incorporates a novel Sparse Motion Dictionary that enables the model to disentangle facial dynamics into interpretable factors. Deviating from previous 'warp-render' approaches, the interpretability of the Sparse Motion Dictionary allows LIA-X to support a highly controllable 'edit-warp-render' strategy, enabling precise manipulation of fine-grained facial semantics in the source portrait. This helps to narrow initial differences with the driving video in terms of pose and expression. Moreover, we demonstrate the scalability of LIA-X by successfully training a large-scale model with approximately 1 billion parameters on extensive datasets. Experimental results show that our proposed method outperforms previous approaches in both self-reenactment and cross-reenactment tasks across several benchmarks. Additionally, the interpretable and controllable nature of LIA-X supports practical applications such as fine-grained, user-guided image and video editing, as well as 3D-aware portrait video manipulation.
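
The abstract's "linear navigation of motion codes" and "edit-warp-render" ideas can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that a motion code is a plain vector, that the Sparse Motion Dictionary is a set of direction vectors, and that navigation means adding a sparse weighted sum of those directions. The names `navigate` and `sparsify`, the top-k sparsification rule, and the "head yaw" label are hypothetical choices made only for illustration.

```python
# Toy illustration only: latent codes are short lists of floats, and the
# "dictionary" is a hypothetical set of direction vectors (here the standard
# basis). None of this reflects the actual LIA-X architecture or training.

def navigate(z, dictionary, coeffs):
    """Linear navigation: return z + sum_i coeffs[i] * dictionary[i]."""
    out = list(z)
    for a, d in zip(coeffs, dictionary):
        if a == 0.0:
            continue  # sparse code: inactive directions contribute nothing
        for k in range(len(out)):
            out[k] += a * d[k]
    return out

def sparsify(coeffs, keep):
    """Keep the `keep` largest-magnitude coefficients and zero the rest
    (an assumed sparsification rule, chosen for simplicity)."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    top = set(order[:keep])
    return [c if i in top else 0.0 for i, c in enumerate(coeffs)]

dim = 4
dictionary = [[1.0 if k == i else 0.0 for k in range(dim)] for i in range(dim)]
z_source = [0.5, -1.0, 0.0, 2.0]

# "Edit": the user nudges one factor of the source (say, a hypothetical
# "head yaw" direction at index 0) to bring it closer to the driving pose.
z_edited = navigate(z_source, dictionary, [0.3, 0.0, 0.0, 0.0])

# "Warp": apply the sparse motion coefficients derived from a driving frame.
driving_coeffs = sparsify([0.05, 0.8, -0.02, -0.6], keep=2)
z_driven = navigate(z_edited, dictionary, driving_coeffs)

# "Render" would decode z_driven into an image; that step is omitted here.
```

Because each coefficient scales exactly one direction, changing a single coefficient changes a single factor of the code, which is the sense in which such a representation can be called interpretable.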

