CV · Dec 30, 2022

Imitator: Personalized Speech-driven 3D Facial Animation

arXiv:2301.00023v1 · 98 citations · h-index: 110
Originality: Incremental advance
AI Analysis

This work improves speech-driven facial animation for applications such as gaming and virtual reality by personalizing animations to individual actors. It is an incremental advance because it builds on prior transformer-based methods.

The paper addresses the problem of generating realistic 3D facial animation from speech. Existing methods ignore identity-specific speaking styles and facial idiosyncrasies; by modeling them, the paper reports improved lip movements and realism, validated through experiments and a user study.

Speech-driven 3D facial animation has been widely explored, with applications in gaming, character animation, virtual reality, and telepresence systems. State-of-the-art methods deform the face topology of the target actor to sync the input audio without considering the identity-specific speaking style and facial idiosyncrasies of the target actor, which results in unrealistic and inaccurate lip movements. To address this, we present Imitator, a speech-driven facial expression synthesis method, which learns identity-specific details from a short input video and produces novel facial expressions matching the identity-specific speaking style and facial idiosyncrasies of the target actor. Specifically, we train a style-agnostic transformer on a large facial expression dataset which we use as a prior for audio-driven facial expressions. Based on this prior, we optimize for identity-specific speaking style based on a short reference video. To train the prior, we introduce a novel loss function based on detected bilabial consonants to ensure plausible lip closures and consequently improve the realism of the generated expressions. Through detailed experiments and a user study, we show that our approach produces temporally coherent facial expressions from input audio while preserving the speaking style of the target actors.
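
The abstract does not give the form of the bilabial-consonant loss, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes one hypothetical mid-lip point per frame for each lip and a per-frame flag marking where a bilabial consonant (/m/, /b/, /p/) was detected, and it penalizes any remaining lip gap in the flagged frames. The function name and input layout are assumptions made for this example.

```python
import math


def lip_closure_penalty(upper_lip, lower_lip, bilabial_mask):
    """Mean squared lip gap over frames flagged as bilabial consonants.

    upper_lip, lower_lip: sequences of (x, y, z) points, one per frame
        (a hypothetical simplification; a real mesh has many lip vertices).
    bilabial_mask: sequence of booleans, True where a bilabial consonant
        was detected in the audio for that frame.
    """
    gaps = [
        math.dist(u, l)  # Euclidean distance between the two lip points
        for u, l, is_bilabial in zip(upper_lip, lower_lip, bilabial_mask)
        if is_bilabial
    ]
    if not gaps:
        return 0.0  # no bilabial frames, nothing to penalize
    return sum(g * g for g in gaps) / len(gaps)


# Three frames: the lips are open in frame 0 (not flagged), half open in
# frame 1 (flagged), and closed in frame 2 (flagged).
upper = [(0.0, 1.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.0)]
lower = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
mask = [False, True, True]
print(lip_closure_penalty(upper, lower, mask))  # 0.125
```

In this sketch only the flagged frames contribute, so an open mouth during vowels is not penalized, while a gap during /m/, /b/, or /p/ raises the penalty.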
