High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization
This work addresses the problem of generating realistic 3D content from a single image, with applications in AI-generated media, by improving the fidelity of 3D GAN inversion.
The paper tackles high-fidelity 3D GAN inversion, which suffers from a geometry-texture trade-off: overfitting the latent code to a single input image tends to damage the estimated geometry. It proposes a pipeline based on pseudo-multi-view optimization that preserves input details and synthesizes photo-realistic novel views. The reported reconstruction and novel view synthesis quality is better than that of state-of-the-art methods, even for out-of-distribution textures, and the pipeline enables applications such as 3D-aware editing.
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views while preserving specific details of the input image. High-fidelity 3D GAN inversion is inherently challenging due to the geometry-texture trade-off, where overfitting to a single-view input image often damages the estimated geometry during latent optimization. To address this challenge, we propose a novel pipeline that builds on pseudo-multi-view estimation with visibility analysis. We keep the original textures for the visible parts and utilize generative priors for the occluded parts. Extensive experiments show that our approach achieves better reconstruction and novel view synthesis quality than state-of-the-art methods, even for images with out-of-distribution textures. The proposed pipeline also enables image attribute editing with the inverted latent code and 3D-aware texture modification. Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
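
The rule of keeping original textures for visible parts and using generative priors for occluded parts can be illustrated with a minimal per-pixel blend. This is only a sketch under stated assumptions, not the paper's implementation: the function name `compose_pseudo_view`, the array shapes, and the idea that the input texture has already been warped into the novel view and that a visibility mask is available are all hypothetical choices made for illustration.

```python
import numpy as np


def compose_pseudo_view(warped_input, generated, visibility):
    """Blend a warped input texture with a generator rendering, pixel by pixel.

    Hypothetical helper, not taken from the paper.
    warped_input, generated: float arrays of shape (H, W, 3).
    visibility: array of shape (H, W); 1 = visible in the input view, 0 = occluded.
    """
    warped_input = np.asarray(warped_input, dtype=np.float32)
    generated = np.asarray(generated, dtype=np.float32)
    if warped_input.shape != generated.shape:
        raise ValueError("warped_input and generated must have the same shape")
    mask = np.clip(np.asarray(visibility, dtype=np.float32), 0.0, 1.0)
    if mask.shape != warped_input.shape[:2]:
        raise ValueError("visibility must have shape (H, W)")
    mask = mask[..., None]  # add a channel axis so the mask broadcasts over RGB
    # Visible pixels keep the input texture; occluded pixels fall back to the generator.
    return mask * warped_input + (1.0 - mask) * generated


# Toy 2x2 example: the input texture is all ones, the generator output all zeros.
warped = np.ones((2, 2, 3), dtype=np.float32)
generated = np.zeros((2, 2, 3), dtype=np.float32)
visibility = np.array([[1, 0], [0, 1]])
pseudo_view = compose_pseudo_view(warped, generated, visibility)
```

In this toy case the pixels marked visible carry the input texture and the others carry the generator output. A soft mask with values between 0 and 1 would give a smooth transition at visibility boundaries, which is a design choice of this sketch rather than something the abstract specifies.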