CV | Sep 3, 2022

Synthesizing Photorealistic Virtual Humans Through Cross-modal Disentanglement

arXiv:2209.01320v2 | 5 citations | h-index: 6
AI Analysis

This work addresses the need for high-quality, real-time virtual human avatars in applications such as digital assistants and the metaverse. It represents an incremental improvement over existing deepfake and talking-head generation methods.

The paper tackles the problem of generating photorealistic virtual human faces with accurate lip motion for practical use, proposing an end-to-end framework that runs in real-time and delivers superior results compared to state-of-the-art methods.

Over the last few decades, many aspects of human life have been enhanced with virtual domains, from the advent of digital assistants such as Amazon's Alexa and Apple's Siri to the latest metaverse efforts of the rebranded Meta. These trends underscore the importance of generating photorealistic visual depictions of humans. This has led to the rapid growth of so-called deepfake and talking-head generation methods in recent years. Despite their impressive results and popularity, they usually lack certain qualitative aspects such as texture quality, lip synchronization, or resolution, and practical aspects such as the ability to run in real time. To allow virtual human avatars to be used in practical scenarios, we propose an end-to-end framework for synthesizing high-quality virtual human faces capable of speaking with accurate lip motion, with a special emphasis on performance. We introduce a novel network utilizing visemes as an intermediate audio representation and a novel data augmentation strategy employing a hierarchical image synthesis approach that allows disentanglement of the different modalities used to control the global head motion. Our method runs in real time and is able to deliver superior results compared to the current state of the art.

