AE-NeRF: Auto-Encoding Neural Radiance Fields for 3D-Aware Object Manipulation
This work addresses the challenge of 3D-aware object manipulation for computer vision and graphics applications, and represents an incremental advance in disentangling 3D attributes.
The paper tackles 3D-aware object manipulation by proposing AE-NeRF, an auto-encoder framework that extracts disentangled 3D attributes (shape, appearance, and camera pose) from an image and renders a high-quality image from them. Its experiments show improved performance over the latest methods.
We propose a novel framework for 3D-aware object manipulation, called Auto-Encoding Neural Radiance Fields (AE-NeRF). Our model, formulated as an auto-encoder, extracts disentangled 3D attributes such as 3D shape, appearance, and camera pose from an image, and renders a high-quality image from these attributes through disentangled generative Neural Radiance Fields (NeRF). To improve disentanglement, we present two losses: a global-local attribute consistency loss defined between input and output, and a swapped-attribute classification loss. Since training such auto-encoding networks from scratch without ground-truth shape and appearance information is non-trivial, we present a stage-wise training scheme, which dramatically boosts performance. We conduct experiments to demonstrate the effectiveness of the proposed model over the latest methods and provide extensive ablation studies.
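To make the encode, swap, and render data flow concrete, the following is a toy sketch only and not the authors' implementation. It assumes linear maps in place of the image encoder and the NeRF renderer, and every name and dimension in it (`encode`, `render`, `swap_appearance`, `consistency_loss`, `IMG_DIM`, and so on) is hypothetical. The consistency term shown here is a simplified stand-in that compares attribute codes of the input with those of its reconstruction; the paper's global-local attribute consistency loss and swapped-attribute classification loss are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the abstract does not specify any dimensions.
IMG_DIM, SHAPE_DIM, APP_DIM, POSE_DIM = 48, 8, 8, 3
LATENT_DIM = SHAPE_DIM + APP_DIM + POSE_DIM

# Toy linear stand-ins for the encoder and the NeRF-based renderer (assumption).
W_enc = rng.normal(scale=0.1, size=(LATENT_DIM, IMG_DIM))
W_dec = rng.normal(scale=0.1, size=(IMG_DIM, LATENT_DIM))


def encode(image):
    """Map a flattened image to separate (shape, appearance, pose) codes."""
    z = W_enc @ image
    shape = z[:SHAPE_DIM]
    appearance = z[SHAPE_DIM:SHAPE_DIM + APP_DIM]
    pose = z[SHAPE_DIM + APP_DIM:]
    return shape, appearance, pose


def render(shape, appearance, pose):
    """Stand-in for the generative NeRF: attribute codes in, flattened image out."""
    return W_dec @ np.concatenate([shape, appearance, pose])


def consistency_loss(image):
    """Simplified consistency term: re-encode the reconstruction, compare codes."""
    codes_in = encode(image)
    codes_out = encode(render(*codes_in))
    return sum(float(np.mean((a - b) ** 2)) for a, b in zip(codes_in, codes_out))


def swap_appearance(image_a, image_b):
    """Render image_a's shape and pose with image_b's appearance."""
    shape_a, _, pose_a = encode(image_a)
    _, appearance_b, _ = encode(image_b)
    return render(shape_a, appearance_b, pose_a)


image_a = rng.normal(size=IMG_DIM)
image_b = rng.normal(size=IMG_DIM)
swapped = swap_appearance(image_a, image_b)
loss = consistency_loss(image_a)
```

In this sketch, manipulation amounts to replacing one attribute code before rendering, which is why disentanglement matters: if the appearance code also carried shape information, the swap would change the object's geometry as well.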