CV | GR | Jul 21, 2022

Injecting 3D Perception of Controllable NeRF-GAN into StyleGAN for Editable Portrait Image Synthesis

arXiv:2207.10257v2 | 29 citations | h-index: 31 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for editable and consistent portrait synthesis in computer vision, offering a hybrid approach that combines strengths of 2D and 3D GANs.

The paper tackles two problems in portrait generation: multi-view inconsistency in 2D GANs and limited semantic editing in 3D-aware GANs. It proposes SURF-GAN, a 3D-aware GAN with unsupervised attribute control, and injects its prior into StyleGAN to enable explicit 3D pose control. The resulting generator is high-fidelity and compatible with existing StyleGAN-based techniques.

Over the years, 2D GANs have achieved great success in photorealistic portrait generation. However, they lack 3D understanding in the generation process and therefore suffer from the multi-view inconsistency problem. To alleviate the issue, many 3D-aware GANs have been proposed and have shown notable results, but 3D GANs struggle with editing semantic attributes, and their controllability and interpretability have not been much explored. In this work, we propose two solutions to overcome these weaknesses of 2D GANs and 3D-aware GANs. We first introduce a novel 3D-aware GAN, SURF-GAN, which is capable of discovering semantic attributes during training and controlling them in an unsupervised manner. We then inject the prior of SURF-GAN into StyleGAN to obtain a high-fidelity 3D-controllable generator. Unlike existing latent-based methods, which allow implicit pose control, the proposed 3D-controllable StyleGAN enables explicit pose control over portrait generation. This distillation allows direct compatibility between 3D control and many StyleGAN-based techniques (e.g., inversion and stylization), and also brings an advantage in terms of computational resources. Our code is available at https://github.com/jgkwak95/SURF-GAN.

Code Implementations: 1 repo
