CV, LG, IV · Jun 17, 2020

Head2Head++: Deep Facial Attributes Re-Targeting

arXiv:2006.10199v2 · 50 citations
Originality: Incremental advance
AI Analysis

This paper addresses realistic facial reenactment for video editing and virtual reality applications, and it represents an incremental improvement over existing methods.

The paper tackles facial video re-targeting by modifying facial attributes of a target subject using a driving monocular sequence, achieving photo-realistic results with end-to-end reenactment at nearly real-time speed (18 fps).

Facial video re-targeting is a challenging problem aiming to modify the facial attributes of a target subject in a seamless manner by a driving monocular sequence. We leverage the 3D geometry of faces and Generative Adversarial Networks (GANs) to design a novel deep learning architecture for the task of facial and head reenactment. Our method is different from purely 3D model-based approaches, and from recent image-based methods that use Deep Convolutional Neural Networks (DCNNs) to generate individual frames. We manage to capture the complex non-rigid facial motion from the driving monocular performances and synthesise temporally consistent videos, with the aid of a sequential Generator and an ad-hoc Dynamics Discriminator network. We conduct a comprehensive set of quantitative and qualitative tests and demonstrate experimentally that our proposed method can successfully transfer facial expressions, head pose and eye gaze from a source video to a target subject, in a photo-realistic and faithful fashion, better than other state-of-the-art methods. Most importantly, our system performs end-to-end reenactment at nearly real-time speed (18 fps).

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
