Semantically Decomposing the Latent Spaces of Generative Adversarial Networks
The work addresses the challenge of fine-grained control in image generation for applications such as face synthesis, though the contribution is incremental in that it builds on existing GAN methods (DCGAN and BEGAN).
The paper tackles the problem of generating diverse images of the same subject with generative adversarial networks. It proposes an algorithm that learns separate latent codes for identities and observations, enabling independent control over subject identity and over contingent image attributes such as lighting and pose. Experiments with human judges and an off-the-shelf face verification system show that the algorithm can generate convincing, identity-matched photographs.
We propose a new algorithm for training generative adversarial networks that jointly learns latent codes for both identities (e.g. individual humans) and observations (e.g. specific photographs). By fixing the identity portion of the latent codes, we can generate diverse images of the same subject, and by fixing the observation portion, we can traverse the manifold of subjects while maintaining contingent aspects such as lighting and pose. Our algorithm features a pairwise training scheme in which each sample from the generator consists of two images with a common identity code. Corresponding samples from the real dataset consist of two distinct photographs of the same subject. In order to fool the discriminator, the generator must produce pairs that are photorealistic, distinct, and appear to depict the same individual. We augment both the DCGAN and BEGAN approaches with Siamese discriminators to facilitate pairwise training. Experiments with human judges and an off-the-shelf face verification system demonstrate our algorithm's ability to generate convincing, identity-matched photographs.
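As an illustration of the pairwise scheme described above, the following minimal sketch shows one way the latent codes for a generator pair could be sampled: each pair shares its identity portion and draws its observation portion independently. The function name `sample_pair_latents`, the uniform prior on [-1, 1], and the 50/50 split of a 100-dimensional code are assumptions made for this example and are not taken from the abstract.

```python
import random


def sample_pair_latents(batch_size, d_identity, d_observation, rng):
    """Return two batches of latent codes that share their identity portion.

    Hypothetical helper: the prior (uniform on [-1, 1]) and the layout
    (identity first, observation second) are assumptions for illustration.
    """
    def uniform_vec(n):
        return [rng.uniform(-1.0, 1.0) for _ in range(n)]

    batch_a, batch_b = [], []
    for _ in range(batch_size):
        # One identity code per pair, reused for both images.
        z_identity = uniform_vec(d_identity)
        # Independent observation codes make the two images distinct.
        batch_a.append(z_identity + uniform_vec(d_observation))
        batch_b.append(z_identity + uniform_vec(d_observation))
    return batch_a, batch_b


rng = random.Random(0)
z_a, z_b = sample_pair_latents(batch_size=4, d_identity=50, d_observation=50, rng=rng)
```

Feeding `z_a` and `z_b` through the same generator would yield the image pairs that the Siamese discriminator compares against pairs of real photographs of one subject. Holding the first `d_identity` entries fixed while resampling the rest corresponds to generating new images of the same subject; the reverse corresponds to traversing subjects under fixed lighting and pose.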