Exploiting GAN Internal Capacity for High-Quality Reconstruction of Natural Images
This work addresses image reconstruction and manipulation for computer vision applications, offering an incremental improvement over existing inversion methods.
The paper tackles the problem of reconstructing natural images with high visual fidelity by exploiting intermediate layers in GAN generators, rather than just the latent space, and shows this leads to increased capacity for representing and manipulating images.
Generative Adversarial Networks (GAN) have demonstrated impressive results in modeling the distribution of natural images, learning latent representations that capture semantic variations in an unsupervised basis. Beyond the generation of novel samples, it is of special interest to exploit the ability of the GAN generator to model the natural image manifold and hence generate credible changes when manipulating images. However, this line of work is conditioned by the quality of the reconstruction. Until now, only inversion to the latent space has been considered, we propose to exploit the representation in intermediate layers of the generator, and we show that this leads to increased capacity. In particular, we observe that the representation after the first dense layer, present in all state-of-the-art GAN models, is expressive enough to represent natural images with high visual fidelity. It is possible to interpolate around these images obtaining a sequence of new plausible synthetic images that cannot be generated from the latent space. Finally, as an example of potential applications that arise from this inversion mechanism, we show preliminary results in exploiting the learned representation in the attention map of the generator to obtain an unsupervised segmentation of natural images.