Synthesizing Normalized Faces from Facial Identity Features
The method enables applications such as facial attribute analysis and 3-D avatar creation. It is incremental, however, in that it builds on existing facial-recognition and synthesis techniques.
The paper addresses the synthesis of frontal, neutral-expression face images from input photographs. It encodes each photograph with features from a facial-recognition network, which are largely invariant to lighting, pose, and expression. A decoder is then trained to predict landmarks and textures independently and to combine them with a differentiable warping operation.
We present a method for synthesizing a frontal, neutral-expression image of a person's face given an input face photograph. This is achieved by learning to generate facial landmarks and textures from features extracted from a facial-recognition network. Unlike previous approaches, our encoding feature vector is largely invariant to lighting, pose, and facial expression. Exploiting this invariance, we train our decoder network using only frontal, neutral-expression photographs. Since these photographs are well aligned, we can decompose them into a sparse set of landmark points and aligned texture maps. The decoder then predicts landmarks and textures independently and combines them using a differentiable image warping operation. The resulting images can be used for a number of applications, such as analyzing facial attributes, exposure and white balance adjustment, or creating a 3-D avatar.
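The final step, combining a predicted texture with predicted landmarks through image warping, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the sparse landmarks have already been turned into a dense per-pixel displacement field (the abstract does not say how), and it uses bilinear sampling as the warp, a common choice because the sampled value varies smoothly with the sampling coordinates. The names `bilinear_sample`, `warp`, and `flow` are hypothetical, and the image is a plain grayscale list of rows to keep the example dependency-free.

```python
import math


def bilinear_sample(image, x, y):
    """Sample a 2-D grayscale image (list of rows) at real-valued (x, y)."""
    h, w = len(image), len(image[0])
    # Clamp so that all four neighbouring pixels stay inside the image.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two rows, then along y between them.
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


def warp(texture, flow):
    """Backward warp: output[y][x] is the texture sampled at (x + dx, y + dy).

    `flow[y][x]` is an assumed (dx, dy) displacement for each output pixel,
    standing in for a field derived from the predicted landmarks.
    """
    h, w = len(texture), len(texture[0])
    return [
        [bilinear_sample(texture, x + flow[y][x][0], y + flow[y][x][1])
         for x in range(w)]
        for y in range(h)
    ]


# Hypothetical 3x3 texture and a uniform half-pixel shift along x.
texture = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
half_shift = [[(0.5, 0.0)] * 3 for _ in range(3)]
warped = warp(texture, half_shift)
```

Because each output pixel is a weighted average of four texture pixels, with weights that are linear in the fractional offsets, gradients can flow both to the texture values and to the displacements. That property is what allows a warp of this kind to sit inside a network trained end to end.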