CV · ML · Mar 25, 2018

Unsupervised Depth Estimation, 3D Face Rotation and Replacement

arXiv:1803.09202v5 · 42 citations
Originality: Incremental advance
AI Analysis

This paper addresses 3D face manipulation without labeled depth data, a problem of interest to computer vision researchers. The contribution is incremental: it relies on a hybrid of methods and a post-processing step.

The paper tackles unsupervised estimation of 3D facial structure from a single image, which enables tasks such as face frontalization and pose transformation without ground-truth depth. The applications it presents show that the resulting 3D transformations are plausible.

We present an unsupervised approach for learning to estimate three dimensional (3D) facial structure from a single image while also predicting 3D viewpoint transformations that match a desired pose and facial geometry. We achieve this by inferring the depth of facial keypoints of an input image in an unsupervised manner, without using any form of ground-truth depth information. We show how it is possible to use these depths as intermediate computations within a new backpropable loss to predict the parameters of a 3D affine transformation matrix that maps inferred 3D keypoints of an input face to the corresponding 2D keypoints on a desired target facial geometry or pose. Our resulting approach, called DepthNets, can therefore be used to infer plausible 3D transformations from one face pose to another, allowing faces to be frontalized, transformed into 3D models or even warped to another pose and facial geometry. Lastly, we identify certain shortcomings with our formulation, and explore adversarial image translation techniques as a post-processing step to re-synthesize complete head shots for faces re-targeted to different poses or identities.
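The core geometric step in the abstract, mapping inferred 3D keypoints of a source face onto the 2D keypoints of a target face with an affine transformation, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`fit_affine_3d_to_2d`, `keypoint_loss`), the 2x4 parameterization of the affine matrix, and the use of an ordinary least-squares solve are assumptions made for illustration. In the paper the depths come from a learned network and the loss is backpropagated through; here the depths are simply passed in as an array.

```python
import numpy as np


def _homogeneous(src_xy, src_depth):
    """Stack (x, y, z, 1) rows for K source keypoints -> array of shape (K, 4)."""
    src_xy = np.asarray(src_xy, dtype=float)
    src_depth = np.asarray(src_depth, dtype=float).reshape(-1, 1)
    ones = np.ones((src_xy.shape[0], 1))
    return np.hstack([src_xy, src_depth, ones])


def fit_affine_3d_to_2d(src_xy, src_depth, tgt_xy):
    """Least-squares 2x4 affine matrix M such that M @ [x, y, z, 1] ~ target (x, y).

    Hypothetical helper: the 2x4 shape and the lstsq solve are illustrative
    assumptions, not taken from the paper's code.
    """
    src_h = _homogeneous(src_xy, src_depth)            # (K, 4)
    tgt_xy = np.asarray(tgt_xy, dtype=float)           # (K, 2)
    # Solve src_h @ M.T ~ tgt_xy in the least-squares sense.
    m_t, *_ = np.linalg.lstsq(src_h, tgt_xy, rcond=None)
    return m_t.T                                       # (2, 4)


def keypoint_loss(src_xy, src_depth, tgt_xy):
    """Mean squared 2D distance between projected source and target keypoints."""
    m = fit_affine_3d_to_2d(src_xy, src_depth, tgt_xy)
    pred = _homogeneous(src_xy, src_depth) @ m.T       # (K, 2)
    diff = pred - np.asarray(tgt_xy, dtype=float)
    return float(np.mean(np.sum(diff ** 2, axis=1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts_3d = rng.normal(size=(10, 3))                  # synthetic source keypoints
    true_m = rng.normal(size=(2, 4))                   # synthetic target pose
    tgt = np.hstack([pts_3d, np.ones((10, 1))]) @ true_m.T
    good = keypoint_loss(pts_3d[:, :2], pts_3d[:, 2], tgt)
    flat = keypoint_loss(pts_3d[:, :2], np.zeros(10), tgt)
    print("loss with correct depths:", good)
    print("loss with all-zero depths:", flat)
```

In this toy setup the fit error is lower when the supplied depths are consistent with the target keypoints than when they are all zero. That gives some intuition for how a keypoint-matching loss of this kind could supervise depth without ground-truth depth labels.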

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
