In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing
This work addresses a specific bottleneck in 3D face editing: applications that require accurate reconstruction of diverse or occluded faces.
The paper tackles the failure of 3D GAN inversion on out-of-distribution (OOD) faces by modeling OOD objects with a separate neural radiance field, which improves reconstruction fidelity and editability on challenging real images and videos.
3D-aware GANs offer new capabilities for view synthesis while preserving the editing functionalities of their 2D counterparts. GAN inversion is a crucial step that seeks the latent code to reconstruct input images or videos, subsequently enabling diverse editing tasks through manipulation of this latent code. However, a model pre-trained on a particular dataset (e.g., FFHQ) often has difficulty reconstructing images with out-of-distribution (OOD) objects such as faces with heavy make-up or occluding objects. We address this issue by explicitly modeling OOD objects from the input in 3D-aware GANs. Our core idea is to represent the image using two individual neural radiance fields: one for the in-distribution content and the other for the out-of-distribution object. The final reconstruction is achieved by optimizing the composition of these two radiance fields with carefully designed regularization. We demonstrate that our explicit decomposition alleviates the inherent trade-off between reconstruction fidelity and editability. We evaluate reconstruction accuracy and editability of our method on challenging real face images and videos and showcase favorable results against other baselines.
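The composition of two radiance fields described above can be illustrated with a minimal sketch of volume rendering along a single ray. The abstract does not state the compositing rule or the regularization terms, so this assumes a rule common in compositional radiance-field work: densities add, and colors are mixed in proportion to each field's density. The function name `composite_two_fields` and its arguments are hypothetical and are not taken from the paper.

```python
import math


def composite_two_fields(sigma_in, rgb_in, sigma_ood, rgb_ood, deltas):
    """Render one ray through two radiance fields sampled at the same points.

    Assumed compositing rule (not confirmed by the source): densities add,
    and colors are blended by each field's share of the total density.

    sigma_in, sigma_ood: per-sample densities of the two fields
    rgb_in, rgb_ood:     per-sample RGB triples of the two fields
    deltas:              distances between consecutive samples
    Returns the rendered pixel color and the per-sample rendering weights.
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light that reaches the current sample
    weights = []
    samples = zip(sigma_in, rgb_in, sigma_ood, rgb_ood, deltas)
    for s_in, c_in, s_ood, c_ood, delta in samples:
        sigma = s_in + s_ood
        # Share of the combined density owned by the in-distribution field.
        share_in = s_in / sigma if sigma > 0.0 else 0.0
        color = [share_in * a + (1.0 - share_in) * b
                 for a, b in zip(c_in, c_ood)]
        # Standard volume-rendering quadrature for this segment.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        weights.append(weight)
        transmittance *= 1.0 - alpha
    return pixel, weights
```

Under this assumed rule, setting the OOD density to zero everywhere reduces the result to ordinary rendering of the in-distribution field alone, which is one way to see why the decomposition leaves the GAN-generated content free to be edited separately from the occluder.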