CV · Dec 2, 2022

LatentSwap3D: Semantic Edits on 3D Image GANs

arXiv:2212.01381v2 · 9 citations · h-index: 20
AI Analysis

This work addresses the need for effective semantic editing tools in 3D generative models. The contribution is incremental: it extends latent-space editing techniques developed for 2D GANs to 3D GANs.

The paper tackles the problem of complex semantic image editing for 3D GANs, which had been underexplored, by proposing LatentSwap3D, a method that enables consistent and disentangled edits across multiple models and datasets, outperforming existing approaches both qualitatively and quantitatively.

3D GANs have the ability to generate latent codes for entire 3D volumes rather than only 2D images. These models offer desirable features like high-quality geometry and multi-view consistency, but, unlike their 2D counterparts, complex semantic image editing tasks for 3D GANs have only been partially explored. To address this problem, we propose LatentSwap3D, a semantic edit approach based on latent space discovery that can be used with any off-the-shelf 3D or 2D GAN model and on any dataset. LatentSwap3D relies on identifying the latent code dimensions corresponding to specific attributes by feature ranking using a random forest classifier. It then performs the edit by swapping the selected dimensions of the image being edited with the ones from an automatically selected reference image. Compared to other latent space control-based edit methods, which were mainly designed for 2D GANs, our method on 3D GANs provides remarkably consistent semantic edits in a disentangled manner and outperforms others both qualitatively and quantitatively. We show results on seven 3D GANs (pi-GAN, GIRAFFE, StyleSDF, MVCGAN, EG3D, StyleNeRF, and VolumeGAN) and on five datasets (FFHQ, AFHQ, Cats, MetFaces, and CompCars).
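The swap step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `top_k_dimensions` and `latent_swap`, the value of `k`, and the toy numbers are all hypothetical. The per-dimension importance scores are assumed to have been computed beforehand (the paper obtains its ranking from a random forest classifier); here they are simply a hand-written list so the example stays self-contained.

```python
from typing import List, Sequence


def top_k_dimensions(importances: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k latent dimensions with the highest scores."""
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i],
                    reverse=True)
    return ranked[:k]


def latent_swap(source: Sequence[float],
                reference: Sequence[float],
                dims: Sequence[int]) -> List[float]:
    """Copy the selected dimensions from `reference` into a copy of `source`."""
    if len(source) != len(reference):
        raise ValueError("latent codes must have the same length")
    edited = list(source)  # leave the original latent code untouched
    for d in dims:
        edited[d] = reference[d]
    return edited


# Hypothetical importance scores for a 5-dimensional latent code.
# In the paper these would come from a random forest's feature ranking
# for one attribute; the values below are made up for illustration.
importances = [0.05, 0.40, 0.10, 0.30, 0.15]

source = [0.0, 0.0, 0.0, 0.0, 0.0]      # latent code of the image being edited
reference = [1.0, 2.0, 3.0, 4.0, 5.0]   # latent code of a reference image

dims = top_k_dimensions(importances, k=2)
edited = latent_swap(source, reference, dims)
print(dims, edited)
```

Only the top-ranked dimensions change, which is the intuition behind the disentanglement claim: dimensions that the ranking does not associate with the attribute keep the source image's values.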

