CV | Dec 1, 2021

HyperInverter: Improving StyleGAN Inversion via Hypernetwork

arXiv:2112.00719v2 | 129 citations
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in GAN inversion for image editing: balancing high reconstruction quality, editability, and inference speed. The advance is incremental but practical for computer vision applications.

The authors tackled the problem of GAN inversion for real-world image manipulation by introducing a two-phase method that improves reconstruction quality and editability while maintaining fast inference, achieving superior results on two challenging datasets.

Real-world image manipulation has achieved fantastic progress in recent years as a result of the exploration and utilization of GAN latent spaces. GAN inversion is the first step in this pipeline, which aims to map the real image to the latent code faithfully. Unfortunately, the majority of existing GAN inversion methods fail to meet at least one of the three requirements listed below: high reconstruction quality, editability, and fast inference. We present a novel two-phase strategy in this research that fits all requirements at the same time. In the first phase, we train an encoder to map the input image to StyleGAN2 $\mathcal{W}$-space, which was proven to have excellent editability but lower reconstruction quality. In the second phase, we supplement the reconstruction ability in the initial phase by leveraging a series of hypernetworks to recover the missing information during inversion. These two steps complement each other to yield high reconstruction quality thanks to the hypernetwork branch and excellent editability due to the inversion done in the $\mathcal{W}$-space. Our method is entirely encoder-based, resulting in extremely fast inference. Extensive experiments on two challenging datasets demonstrate the superiority of our method.
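The two-phase idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the linear "generator", the linear "encoder", and the closed-form `toy_hypernetwork` are all hypothetical stand-ins chosen so the example runs with NumPy alone. In particular, the real method trains hypernetworks on StyleGAN2, whereas here the weight offset is a hand-derived rank-one update, and passing `w` to the hypernetwork is a simplification made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, IMAGE_DIM = 4, 16  # toy sizes, assumed for illustration

# Hypothetical stand-ins: a linear "generator" and a linear "encoder".
theta = rng.normal(size=(IMAGE_DIM, LATENT_DIM))       # generator weights
enc = rng.normal(size=(LATENT_DIM, IMAGE_DIM)) * 0.1   # encoder weights


def generator(w, weights):
    """Toy generator: maps a latent code to an 'image' vector."""
    return weights @ w


def encoder(x):
    """Phase 1: map the input image to a latent code."""
    return enc @ x


def toy_hypernetwork(x, x_hat, w):
    """Phase 2 stand-in: predict a generator weight offset from the
    input and its initial reconstruction. A rank-one closed-form update
    is used here in place of a learned network (an assumption)."""
    residual = x - x_hat
    return np.outer(residual, w) / (w @ w)


def invert(x):
    w = encoder(x)                          # phase 1: latent code
    x_hat = generator(w, theta)             # initial reconstruction
    delta = toy_hypernetwork(x, x_hat, w)   # phase 2: weight offset
    x_final = generator(w, theta + delta)   # refined reconstruction
    return w, x_hat, x_final


x = rng.normal(size=IMAGE_DIM)
w, x_hat, x_final = invert(x)
err_phase1 = np.linalg.norm(x - x_hat)
err_phase2 = np.linalg.norm(x - x_final)
print(f"phase 1 error: {err_phase1:.4f}, phase 2 error: {err_phase2:.2e}")
```

The point of the sketch is the division of labor: the latent code `w` is left untouched by the second phase, so edits applied in latent space remain available, while the missing detail is recovered by adjusting the generator weights rather than the code.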

Code Implementations: 1 repo