Semantic Unfolding of StyleGAN Latent Space
This work addresses a specific issue in face image editing for GAN users, but it is incremental as it builds on existing StyleGAN techniques.
The paper tackles suboptimal facial attribute disentanglement in StyleGAN's latent space, which makes linear editing flawed. It proposes a supervised method that uses normalizing flows to learn a proxy representation, yielding a more efficient space for face image editing.
Generative adversarial networks (GANs) have proven to be surprisingly efficient for image editing by inverting and manipulating the latent code corresponding to an input real image. This editing property emerges from the disentangled nature of the latent space. In this paper, we identify that the facial attribute disentanglement is not optimal, so facial editing that relies on linear attribute separation is flawed. We therefore propose to improve semantic disentanglement with supervision. Our method consists of learning a proxy latent representation using normalizing flows, and we show that this leads to a more efficient space for face image editing.
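As a rough illustration of editing through a proxy latent space, the sketch below maps a latent code through an invertible transform, shifts it linearly along an attribute direction, and maps it back. It is a minimal sketch under stated assumptions, not the method of the paper: the single RealNVP-style affine coupling layer, the random untrained weights, the 4-dimensional toy code, and the names `AffineCoupling` and `edit` are all hypothetical. The supervised training that would align attributes with directions in the proxy space is omitted.

```python
import math
import random


class AffineCoupling:
    """One RealNVP-style affine coupling layer on even-dimensional vectors.

    Hypothetical stand-in for a normalizing flow; weights are random, not trained.
    """

    def __init__(self, dim, seed=0):
        assert dim % 2 == 0, "dimension must be even to split in halves"
        rng = random.Random(seed)
        self.half = dim // 2
        # Tiny linear maps producing the log-scale and the shift from the first half.
        self.Ws = [[rng.gauss(0, 0.1) for _ in range(self.half)] for _ in range(self.half)]
        self.Wt = [[rng.gauss(0, 0.1) for _ in range(self.half)] for _ in range(self.half)]

    def _params(self, a):
        # tanh keeps the log-scale bounded, so the scale stays strictly positive.
        log_s = [math.tanh(sum(w * x for w, x in zip(row, a))) for row in self.Ws]
        t = [sum(w * x for w, x in zip(row, a)) for row in self.Wt]
        return log_s, t

    def forward(self, w):
        """Map an original latent code w to the proxy code z."""
        a, b = w[:self.half], w[self.half:]
        log_s, t = self._params(a)
        return a + [bi * math.exp(s) + ti for bi, s, ti in zip(b, log_s, t)]

    def inverse(self, z):
        """Map a proxy code z back to the original latent space (exact inverse)."""
        a, b = z[:self.half], z[self.half:]
        log_s, t = self._params(a)
        return a + [(bi - ti) * math.exp(-s) for bi, s, ti in zip(b, log_s, t)]


def edit(flow, w, direction, alpha):
    """Linear edit in the proxy space: z' = z + alpha * direction, then invert."""
    z = flow.forward(w)
    z_edit = [zi + alpha * di for zi, di in zip(z, direction)]
    return flow.inverse(z_edit)


flow = AffineCoupling(dim=4, seed=1)
w = [0.5, -1.0, 2.0, 0.3]            # toy latent code (assumption)
direction = [0.0, 0.0, 1.0, 0.0]     # toy attribute direction (assumption)
w_edited = edit(flow, w, direction, alpha=2.0)
```

Because the transform is invertible, an edit made in the proxy space always corresponds to a valid code in the original latent space, which can then be passed to the generator. A linear step in the proxy space generally corresponds to a nonlinear path in the original one.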