Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing
This work addresses real-time image editing for users of GAN-based tools, offering an encoder-based method that is faster and more accurate than existing optimization-based approaches.
The paper tackles slow and inaccurate editing of real images with GANs by proposing StyleMapGAN, whose spatially variant latent space enables more accurate encoder-based embedding. It significantly outperforms state-of-the-art models in tasks such as local editing and image interpolation.
Generative adversarial networks (GANs) synthesize realistic images from random latent vectors. Although manipulating the latent vectors controls the synthesized outputs, editing real images with GANs suffers from either i) time-consuming optimization for projecting real images to the latent vectors or ii) inaccurate embedding through an encoder. We propose StyleMapGAN, in which the intermediate latent space has spatial dimensions and a spatially variant modulation replaces AdaIN. This design makes the embedding through an encoder more accurate than existing optimization-based methods while maintaining the properties of GANs. Experimental results demonstrate that our method significantly outperforms state-of-the-art models in various image manipulation tasks such as local editing and image interpolation. Last but not least, conventional editing methods on GANs remain valid on StyleMapGAN. Source code is available at https://github.com/naver-ai/StyleMapGAN.
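To make the contrast with AdaIN concrete, the following is a minimal NumPy sketch, not the released implementation. It assumes feature maps of shape (C, H, W), per-channel normalization over the spatial axes, and modulation maps already resized to the feature resolution. The function names `adain`, `spatial_modulation`, and `blend_stylemaps` are hypothetical and do not come from the official repository. The mask-based blend is one plausible way a spatial latent could support local editing.

```python
import numpy as np


def normalize(h, eps=1e-5):
    """Normalize each channel of h (C, H, W) over its spatial axes.

    Per-channel statistics are an assumption made for this sketch.
    """
    mu = h.mean(axis=(1, 2), keepdims=True)
    sigma = h.std(axis=(1, 2), keepdims=True)
    return (h - mu) / (sigma + eps)


def adain(h, gamma, beta):
    """AdaIN-style modulation: one scale and one shift per channel.

    gamma, beta: arrays of shape (C,), shared by every spatial position.
    """
    return gamma[:, None, None] * normalize(h) + beta[:, None, None]


def spatial_modulation(h, gamma_map, beta_map):
    """Spatially variant modulation: scale and shift differ per position.

    gamma_map, beta_map: arrays of shape (C, H, W).
    """
    return gamma_map * normalize(h) + beta_map


def blend_stylemaps(w_original, w_reference, mask):
    """Mix two spatial latents (C, H, W) with a mask (H, W) in [0, 1].

    Positions where mask is 1 take the reference latent; positions
    where mask is 0 keep the original latent.
    """
    return mask * w_reference + (1.0 - mask) * w_original


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(4, 8, 8))
    gamma_map = rng.normal(size=(4, 8, 8))
    beta_map = rng.normal(size=(4, 8, 8))
    modulated = spatial_modulation(features, gamma_map, beta_map)
    print(modulated.shape)
```

When the modulation maps are constant across positions, `spatial_modulation` reduces to `adain`, which is one way to see the spatial variant as a generalization: the extra spatial axes let an encoder store where content is located, not only what it is.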