CV | Nov 2, 2019

Self-supervised Deformation Modeling for Facial Expression Editing

arXiv:1911.00735v2 | 14 citations
Originality: Incremental advance
AI Analysis

This paper addresses realistic facial expression editing for computer vision and graphics applications. Its approach is novel, but the improvement over existing methods is incremental.

The paper proposes a network that separates facial expression editing into motion editing and texture editing. Deformation is modeled in a self-supervised way, without ground-truth deformation annotations, and the authors report improvements over the state of the art in both qualitative and quantitative evaluations.

Recent advances in deep generative models have demonstrated impressive results in photo-realistic facial image synthesis and editing. Facial expressions are inherently the result of muscle movement. However, existing neural network-based approaches usually rely only on texture generation to edit expressions and largely neglect the motion information. In this work, we propose a novel end-to-end network that disentangles the task of facial editing into two steps: a "motion-editing" step and a "texture-editing" step. In the "motion-editing" step, we explicitly model facial movement through image deformation, warping the image into the desired expression. In the "texture-editing" step, we generate necessary textures, such as teeth and shading effects, for a photo-realistic result. Our physically-based task-disentanglement system design allows each step to learn a focused task, removing the need to generate texture to hallucinate motion. Our system is trained in a self-supervised manner, requiring no ground truth deformation annotation. Using Action Units [8] as the representation for facial expression, our method improves the state-of-the-art facial expression editing performance in both qualitative and quantitative evaluations.
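
The two-step split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the deformation is a dense per-pixel displacement field applied by backward warping with bilinear sampling, and that the texture step adds a residual image, neither of which the abstract confirms. In the paper both quantities would come from learned networks conditioned on Action Units; here they are hand-written, and every function name (`bilinear_sample`, `warp`, `add_texture`) is hypothetical.

```python
# Toy sketch of "warp first, then add texture" on a grayscale image stored
# as a list of rows. All names are hypothetical; the displacement field and
# the residual are hand-specified stand-ins for network outputs.


def bilinear_sample(img, x, y):
    """Sample img at real-valued (x, y), clamping coordinates to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(img, flow):
    """Motion-editing stand-in: output pixel (x, y) reads from (x + dx, y + dy)."""
    h, w = len(img), len(img[0])
    return [
        [bilinear_sample(img, x + flow[y][x][0], y + flow[y][x][1]) for x in range(w)]
        for y in range(h)
    ]


def add_texture(warped, residual):
    """Texture-editing stand-in: add a residual and clamp intensities to [0, 1]."""
    return [
        [min(max(p + r, 0.0), 1.0) for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(warped, residual)
    ]


# Usage: shift a 1x3 image by one pixel, then brighten it slightly.
image = [[0.0, 0.5, 1.0]]
flow = [[(1.0, 0.0), (1.0, 0.0), (1.0, 0.0)]]  # (dx, dy) per pixel
residual = [[0.25, 0.25, 0.25]]
edited = add_texture(warp(image, flow), residual)
```

The point of the split is visible even in the toy version: `warp` can only move existing pixels, so anything that was not in the input (teeth, new shading) has to come from the second step.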

