A Compact and Semantic Latent Space for Disentangled and Controllable Image Editing
This work addresses the challenge of controlling attributes independently when editing images with generative models, and represents an incremental improvement over existing methods.
The paper tackles the problem of disentangled and controllable image editing in GANs by proposing an autoencoder that reorganizes StyleGAN's latent space into decorrelated axes for independent attribute editing, achieving greater disentanglement than competing methods while maintaining image fidelity.
Recent advances in the field of generative models, and in particular generative adversarial networks (GANs), have led to substantial progress in controlled image editing, especially compared with the pre-deep learning era. Despite their powerful ability to apply realistic modifications to an image, these methods often lack properties such as disentanglement (the capacity to edit attributes independently). In this paper, we propose an autoencoder which re-organizes the latent space of StyleGAN so that each attribute we wish to edit corresponds to an axis of the new latent space, and so that the latent axes are decorrelated, encouraging disentanglement. We work in a compressed version of the latent space, obtained with Principal Component Analysis, which reduces the parameter complexity of our autoencoder and leads to short training times ($\sim$45 minutes). Qualitative and quantitative results demonstrate the editing capabilities of our approach, with greater disentanglement than competing methods, while maintaining fidelity to the original image with respect to identity. Our autoencoder architecture is simple and straightforward, facilitating implementation.
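To make the two ingredients named above concrete (PCA compression of the latent space and decorrelated latent axes), the following is a minimal NumPy sketch. It is not the paper's implementation: the latent codes are synthetic stand-ins rather than StyleGAN latents, the dimensions are arbitrary, and the helper names (`fit_pca`, `compress`, `decompress`, `decorrelation_penalty`) are hypothetical. The penalty shown is one plausible way to measure correlation between axes, assumed here for illustration; the abstract does not specify the loss actually used.

```python
import numpy as np

def fit_pca(w, k):
    """Return the mean and top-k principal directions of latent codes w, shape (n, d)."""
    mean = w.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(w - mean, full_matrices=False)
    return mean, vt[:k]

def compress(w, mean, components):
    """Project latent codes onto the retained principal directions."""
    return (w - mean) @ components.T

def decompress(z, mean, components):
    """Map compressed codes back to the original latent space."""
    return z @ components + mean

def decorrelation_penalty(z):
    """Mean squared off-diagonal entry of the correlation matrix of z, shape (n, k).

    Illustrative measure only (an assumption, not the paper's stated loss).
    """
    corr = np.corrcoef(z, rowvar=False)
    off_diag = corr - np.diag(np.diag(corr))
    k = corr.shape[0]
    return float((off_diag ** 2).sum() / (k * (k - 1)))

# Synthetic stand-in for latent codes: approximately rank-16 data in 64 dimensions.
rng = np.random.default_rng(0)
w = rng.normal(size=(2000, 16)) @ rng.normal(size=(16, 64))
w += 0.01 * rng.normal(size=w.shape)

mean, components = fit_pca(w, k=16)
z = compress(w, mean, components)
w_rec = decompress(z, mean, components)
reconstruction_mse = float(np.mean((w - w_rec) ** 2))

# PCA coordinates are decorrelated; an arbitrary linear re-mixing of them is not.
z_mixed = z @ rng.normal(size=(16, 16))
penalty_pca = decorrelation_penalty(z)
penalty_mixed = decorrelation_penalty(z_mixed)
```

In this sketch the PCA coordinates `z` have a near-zero penalty, while the arbitrarily re-mixed codes `z_mixed` do not. An autoencoder operating on `z` could use a term of this kind to keep its own latent axes decorrelated while it aligns them with the attributes to be edited.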