Make It So: Steering StyleGAN for Any Image Inversion and Editing
This work addresses the problem of maintaining editing capabilities in GAN inversion for out-of-domain images. The contribution is incremental, as it builds on existing StyleGAN methods.
The paper tackles the challenge of accurately mapping real-world images to StyleGAN latent variables for editing. It proposes Make It So, a GAN inversion method that operates in the noise space ($\mathcal{Z}$), and reports a fivefold improvement in inversion accuracy over the state-of-the-art method PTI and tenfold better edit quality for complex indoor scenes.
StyleGAN's disentangled style representation enables powerful image editing by manipulating the latent variables, but accurately mapping real-world images to their latent variables (GAN inversion) remains a challenge. Existing GAN inversion methods struggle to maintain editing directions and produce realistic results. To address these limitations, we propose Make It So, a novel GAN inversion method that operates in the $\mathcal{Z}$ (noise) space rather than the typical $\mathcal{W}$ (latent style) space. Make It So preserves editing capabilities, even for out-of-domain images. This is a crucial property that was overlooked in prior methods. Our quantitative evaluations demonstrate that Make It So outperforms the state-of-the-art method PTI~\cite{roich2021pivotal} by a factor of five in inversion accuracy and achieves ten times better edit quality for complex indoor scenes.
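To make the idea of inverting in $\mathcal{Z}$ concrete, the following is a minimal sketch of optimization-based inversion in which the free variable is the noise vector $z$ rather than the style code $w$. It is not the Make It So algorithm, whose details the abstract does not give. The generator is a hypothetical toy (a `tanh` "mapping" step followed by a linear "synthesis" step) standing in for StyleGAN, and all names (`generate`, `invert_in_z`, `A`, `B`) are assumptions introduced for illustration.

```python
import math
import random

# Toy stand-ins (hypothetical, NOT StyleGAN): a "mapping network" z -> w and a
# linear "synthesis network" w -> x, small enough to differentiate by hand.
Z_DIM, W_DIM, X_DIM = 4, 4, 8
rng = random.Random(0)
A = [[rng.gauss(0, 0.5) for _ in range(Z_DIM)] for _ in range(W_DIM)]  # mapping weights
B = [[rng.gauss(0, 0.5) for _ in range(W_DIM)] for _ in range(X_DIM)]  # synthesis weights


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def mat_t_vec(m, v):
    # Multiply by the transpose of m without building it.
    return [sum(m[i][j] * v[i] for i in range(len(m))) for j in range(len(m[0]))]


def generate(z):
    w = [math.tanh(u) for u in matvec(A, z)]  # noise z -> style code w
    return matvec(B, w), w                    # style code w -> "image" x


def loss_and_grad(z, target):
    x, w = generate(z)
    r = [xi - ti for xi, ti in zip(x, target)]
    loss = 0.5 * sum(ri * ri for ri in r)
    # Chain rule: dL/dz = A^T [ (1 - w^2) * (B^T r) ]
    g_w = mat_t_vec(B, r)
    g_u = [(1.0 - wi * wi) * gi for wi, gi in zip(w, g_w)]
    return loss, mat_t_vec(A, g_u)


def invert_in_z(target, steps=300, lr=0.5):
    # Gradient descent on z; a step is accepted only if it lowers the loss.
    z = [0.0] * Z_DIM
    loss, grad = loss_and_grad(z, target)
    initial_loss = loss
    for _ in range(steps):
        cand = [zi - lr * gi for zi, gi in zip(z, grad)]
        cand_loss, cand_grad = loss_and_grad(cand, target)
        if cand_loss < loss:
            z, loss, grad = cand, cand_loss, cand_grad
        else:
            lr *= 0.5  # step too large: shrink and retry
    return z, initial_loss, loss


# Build a target from a known z so an exact inverse exists for this toy.
z_true = [rng.gauss(0, 1) for _ in range(Z_DIM)]
target, _ = generate(z_true)
z_hat, initial_loss, final_loss = invert_in_z(target)
print(f"reconstruction loss: {initial_loss:.4f} -> {final_loss:.6f}")
```

Because the optimized variable is $z$, every candidate passes through the mapping step before synthesis, so the resulting style code always lies in the range of the mapping. Whether this is the mechanism by which Make It So preserves editing directions is not stated in the abstract. The sketch only illustrates the choice of optimization variable.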