CV · GR · MM · Apr 13, 2022

Dynamic Neural Textures: Generating Talking-Face Videos with Continuously Controllable Expressions

Tsinghua
arXiv:2204.06180v1 · 11 citations · h-index: 142
Originality: Incremental advance
AI Analysis

This work addresses the need for controllable expression generation in talking-face videos. The advance is incremental: it builds on existing methods by adding explicit control over expression intensity and a submodule that completes details in the teeth area.

The paper tackles the generation of talking-face videos with controllable expressions. The proposed method combines dynamic neural textures with continuous intensity expression coding (CIEC) to produce high-quality video in real time with continuously adjustable expressions. In experiments and a user study, it outperforms four baseline methods.

Recently, talking-face video generation has received considerable attention. So far, most methods generate results with neutral expressions or with expressions that are implicitly determined by neural networks in an uncontrollable way. In this paper, we propose a method to generate talking-face videos with continuously controllable expressions in real time. Our method is based on an important observation: most expression information lies in textures rather than in facial geometry of moderate resolution. We therefore use neural textures to generate high-quality talking-face videos and design a novel neural network that generates neural textures for image frames (which we call dynamic neural textures) based on the input expression and continuous intensity expression coding (CIEC). Our method uses a 3DMM as the 3D model to sample the dynamic neural texture. The 3DMM does not cover the teeth area, so we propose a teeth submodule to complete the details of the teeth. Results and an ablation study show the effectiveness of our method in generating high-quality talking-face videos with continuously controllable expressions. We also set up four baseline methods by combining existing representative methods and compare them with our method. Experimental results, including a user study, show that our method has the best performance.
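The abstract names two ingredients that a small sketch may make more concrete: an expression code with a continuous intensity, and a texture that is sampled through the UV coordinates of a 3D face model. The Python below is an illustration only, not the paper's implementation. The abstract does not specify how CIEC is encoded or how the network produces textures, so `intensity_code` (a one-hot vector scaled by intensity) and `sample_texture` (plain bilinear lookup) are hypothetical helpers chosen for simplicity.

```python
import numpy as np


def intensity_code(expression_index, intensity, num_expressions):
    """Hypothetical encoding (an assumption, not the paper's CIEC):
    a one-hot expression vector scaled by an intensity in [0, 1]."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be in [0, 1]")
    code = np.zeros(num_expressions, dtype=np.float32)
    code[expression_index] = intensity
    return code


def sample_texture(texture, uv):
    """Bilinearly sample an (H, W, C) texture at (N, 2) UV coordinates.

    UV values are clipped to [0, 1]; u indexes columns, v indexes rows.
    In a full pipeline the UV coordinates would come from rasterizing
    the 3D face model, which is outside the scope of this sketch.
    """
    h, w, _ = texture.shape
    uv = np.clip(np.asarray(uv, dtype=np.float64), 0.0, 1.0)
    x = uv[:, 0] * (w - 1)
    y = uv[:, 1] * (h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = (x - x0)[:, None]  # horizontal interpolation weight
    fy = (y - y0)[:, None]  # vertical interpolation weight
    top = texture[y0, x0] * (1 - fx) + texture[y0, x1] * fx
    bottom = texture[y1, x0] * (1 - fx) + texture[y1, x1] * fx
    return top * (1 - fy) + bottom * fy
```

In this sketch, raising `intensity` from 0 to 1 moves the code smoothly from neutral to the full expression, which is one simple way to obtain a continuously adjustable conditioning signal. A network conditioned on such a code could emit a fresh feature texture per frame, and `sample_texture` would then look up per-pixel features for rendering.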
