Unsupervised Facial Geometry Learning for Sketch to Photo Synthesis
This work addresses a need in law enforcement and digital entertainment for generating realistic photos from sketches, though the contribution is incremental because it builds on existing unsupervised image-to-image translation techniques.
The paper tackles the problem of synthesizing photo-realistic images from face sketches without paired training data by introducing a perceptual discriminator to learn facial geometry, resulting in significant improvements in both quality and recognition rates of the synthesized photos.
Face sketch-photo synthesis is a critical application in law enforcement and the digital entertainment industry, where the goal is to learn the mapping between a face sketch image and its corresponding photo-realistic image. However, the limited amount of paired sketch-photo training data usually prevents current frameworks from learning a robust mapping between the geometry of sketches and their matching photo-realistic images. In this work, we therefore present an approach for learning to synthesize a photo-realistic image from a face sketch in an unsupervised fashion. In contrast to current unsupervised image-to-image translation techniques, our framework leverages a novel perceptual discriminator to learn the geometry of the human face. Learning this facial prior information enables the network to remove geometrical artifacts from the face sketch. We demonstrate that simultaneously optimizing the face photo generator network against the proposed perceptual discriminator and a texture-wise discriminator results in a significant improvement in the quality and recognition rate of the synthesized photos. We evaluate the proposed network by conducting extensive experiments on multiple baseline sketch-photo datasets.
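
As a rough illustration of what optimizing one generator against two discriminators can look like, the sketch below combines two adversarial terms into a single generator objective. It is not the paper's formulation. The non-saturating loss form, the function names (`adversarial_term`, `generator_loss`) and the weight `lambda_geometry` are all assumptions introduced here for illustration; the abstract does not specify the loss functions or their weighting.

```python
import math


def adversarial_term(prob, eps=1e-12):
    """Non-saturating generator loss -log D(G(x)) for one discriminator
    output in (0, 1]. The loss form is an assumption, not from the paper."""
    return -math.log(max(prob, eps))


def generator_loss(texture_probs, geometry_probs, lambda_geometry=1.0):
    """Weighted sum of the mean adversarial losses from two discriminators.

    texture_probs:   hypothetical outputs of a texture-wise discriminator
                     on synthesized photos (probability of "real").
    geometry_probs:  hypothetical outputs of a geometry (perceptual)
                     discriminator on the same synthesized photos.
    lambda_geometry: hypothetical weight balancing the two terms.
    """
    texture_term = sum(adversarial_term(p) for p in texture_probs) / len(texture_probs)
    geometry_term = sum(adversarial_term(p) for p in geometry_probs) / len(geometry_probs)
    return texture_term + lambda_geometry * geometry_term


# Example: both discriminators are undecided (0.5) on every synthesized photo.
loss = generator_loss([0.5, 0.5], [0.5, 0.5], lambda_geometry=2.0)
```

In this sketch the generator is penalized whenever either discriminator judges its output to be fake, so texture realism and geometric plausibility are pushed in the same update step. Raising `lambda_geometry` shifts the emphasis toward the geometry term.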