Diverse Inpainting and Editing with GAN Inversion
This work addresses the trade-off between high-fidelity reconstruction and editability in image inpainting for computer vision applications, although the contribution appears incremental because it builds on existing GAN inversion methods.
The paper tackles the problem of inverting erased images into StyleGAN's latent space for realistic inpainting and editing. It achieves diverse inpaintings by augmenting the inverted latent codes with different latent samples, and it reports significant improvements in quantitative metrics and visual comparisons.
Recent inversion methods have shown that real images can be inverted into StyleGAN's latent space and that numerous edits can be applied to those images, thanks to the semantically rich feature representations of well-trained GAN models. However, extensive research has also shown that image inversion is challenging because of the trade-off between high-fidelity reconstruction and editability. In this paper, we tackle an even more difficult task: inverting erased images into the GAN's latent space for realistic inpainting and editing. Furthermore, by augmenting the inverted latent codes with different latent samples, we achieve diverse inpaintings. Specifically, we propose to learn an encoder and a mixing network that combine encoded features from erased images with StyleGAN's mapped features from random samples. To encourage the mixing network to use both inputs, we train the networks on generated data via a novel set-up. We also utilize higher-rate features to prevent color inconsistencies between the inpainted and unerased parts. We run extensive experiments and compare our method with state-of-the-art inversion and inpainting methods. Quantitative metrics and visual comparisons show significant improvements.
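To make the idea of combining encoded features with mapped random samples concrete, the following is a minimal sketch and not the architecture used in the paper. It assumes that both inputs are W+ latent codes of shape 18 x 512 and that the mixing step is a single learned sigmoid gate. The names `init_mixer` and `mix_latents`, the gate form, and the random stand-ins for the encoder and mapping-network outputs are all hypothetical.

```python
import numpy as np

# Assumed W+ latent shape (style layers x channels); not stated in the abstract.
NUM_LAYERS, LATENT_DIM = 18, 512


def init_mixer(rng, dim=LATENT_DIM):
    """Create parameters for a hypothetical one-layer gating mixer."""
    return {
        "weight": rng.normal(0.0, 0.02, size=(2 * dim, dim)),
        "bias": np.zeros(dim),
    }


def mix_latents(w_enc, w_rand, params):
    """Blend encoder latents with mapped random latents.

    A sigmoid gate computed from both inputs decides, per channel, how much
    of the encoded code (visible content) and how much of the random code
    (source of diversity) ends up in the output.
    """
    joint = np.concatenate([w_enc, w_rand], axis=-1)      # (layers, 2 * dim)
    logits = joint @ params["weight"] + params["bias"]    # (layers, dim)
    gate = 1.0 / (1.0 + np.exp(-logits))                  # values in (0, 1)
    return gate * w_enc + (1.0 - gate) * w_rand


rng = np.random.default_rng(0)
params = init_mixer(rng)

# Stand-in for the encoder output on one erased image (random placeholder).
w_enc = rng.normal(size=(NUM_LAYERS, LATENT_DIM))

# Stand-ins for StyleGAN mapping-network outputs of three random samples.
samples = [rng.normal(size=(NUM_LAYERS, LATENT_DIM)) for _ in range(3)]

# One mixed code per random sample; each would yield a different inpainting.
mixed = [mix_latents(w_enc, w_rand, params) for w_rand in samples]
```

Under these assumptions, each random sample produces a different mixed code for the same erased image, which is the mechanism behind diverse outputs. In a trained system the gate parameters would be learned, and the mixed code would be passed to the StyleGAN generator.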