CV · May 14, 2020

DeepFaceFlow: In-the-wild Dense 3D Facial Motion Estimation

arXiv:2005.07298v1 · 12 citations
Originality: Incremental advance
AI Analysis

The paper addresses the problem of capturing detailed facial dynamics for applications such as expression recognition and reenactment, though it appears incremental because it builds on existing flow and 3D reconstruction methods.

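The paper tackles dense 3D facial motion estimation from pairs of monocular in-the-wild images. It reports improved performance over state-of-the-art flow and 3D reconstruction methods, generates 3D flow maps at about 60 fps given registered image pairs, and, when incorporated into a full-head facial video synthesis method, yields highly realistic results.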

Dense 3D facial motion capture from only monocular in-the-wild pairs of RGB images is a highly challenging problem with numerous applications, ranging from facial expression recognition to facial reenactment. In this work, we propose DeepFaceFlow, a robust, fast, and highly accurate framework for the dense estimation of 3D non-rigid facial flow between pairs of monocular images. Our DeepFaceFlow framework was trained and tested on two very large-scale facial video datasets, one of them of our own collection and annotation, with the aid of an occlusion-aware and 3D-based loss function. We conduct comprehensive experiments probing different aspects of our approach and demonstrating its improved performance against state-of-the-art flow and 3D reconstruction methods. Furthermore, we incorporate our framework in a full-head state-of-the-art facial video synthesis method and demonstrate the ability of our method in better representing and capturing the facial dynamics, resulting in a highly realistic facial video synthesis. Given registered pairs of images, our framework generates 3D flow maps at ~60 fps.

Code Implementations: 1 repo