CV, LG, IV | May 22, 2020

Head2Head: Video-based Neural Head Synthesis

arXiv:2005.10954v1 | 76 citations
Originality: Incremental advance
AI Analysis

This paper addresses video-based neural head synthesis for applications such as entertainment or communication. It appears incremental because it builds on existing methods and adds specific improvements.

The paper tackles facial reenactment by proposing a novel machine learning architecture that exploits facial motion structure and enforces temporal consistency, achieving more accurate photo-realistic results than state-of-the-art methods.

In this paper, we propose a novel machine learning architecture for facial reenactment. In particular, contrary to the model-based approaches or recent frame-based methods that use Deep Convolutional Neural Networks (DCNNs) to generate individual frames, we propose a novel method that (a) exploits the special structure of facial motion (paying particular attention to mouth motion) and (b) enforces temporal consistency. We demonstrate that the proposed method can transfer facial expressions, pose and gaze of a source actor to a target video in a photo-realistic fashion more accurately than state-of-the-art methods.

Code Implementations: 1 repo