ARM: Appearance Reconstruction Model for Relightable 3D Generation
This work addresses the challenge of faithful appearance generation in image-to-3D reconstruction, which is crucial for graphics and VR applications, though the contribution appears incremental because it builds on existing geometry methods.
The paper tackles the problem of generating realistic appearance in 3D reconstruction from sparse-view images. It introduces ARM, which decouples geometry from appearance and processes appearance in UV texture space with a material prior. The authors report improved texture quality and better quantitative and qualitative results than existing methods.
Recent image-to-3D reconstruction models have greatly advanced geometry generation, but they still struggle to faithfully generate realistic appearance. To address this, we introduce ARM, a novel method that reconstructs high-quality 3D meshes and realistic appearance from sparse-view images. The core of ARM lies in decoupling geometry from appearance, processing appearance within the UV texture space. Unlike previous methods, ARM improves texture quality by explicitly back-projecting measurements onto the texture map and processing them in a UV space module with a global receptive field. To resolve ambiguities between material and illumination in input images, ARM introduces a material prior that encodes semantic appearance information, enhancing the robustness of appearance decomposition. Trained on just 8 H100 GPUs, ARM outperforms existing methods both quantitatively and qualitatively.
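To make the back-projection step described above more concrete, the following is a minimal NumPy sketch of projecting image measurements onto a UV texture map. It is an illustration under stated assumptions, not the authors' implementation: the function name `backproject_to_uv` and all of its parameters are hypothetical, the cameras are assumed to be simple pinhole cameras, sampling is nearest-neighbor, views are averaged with equal weight, and occlusion is ignored. It also assumes that a world-space surface point is already known for every texel, for example from rasterizing the mesh in UV space.

```python
import numpy as np


def backproject_to_uv(texel_xyz, texel_mask, images, intrinsics, world_to_cam):
    """Average image colors onto a UV texture (hypothetical helper).

    texel_xyz:    (H, W, 3) world-space surface point for each texel.
    texel_mask:   (H, W) bool, True where the texel is covered by the mesh.
    images:       list of (h, w, 3) float arrays, one per input view.
    intrinsics:   list of (3, 3) pinhole intrinsic matrices K.
    world_to_cam: list of (4, 4) world-to-camera transforms.
    Returns (texture, weight): (H, W, 3) mean color and (H, W) view count.
    Assumption: no visibility test, so occluded texels also receive color.
    """
    H, W, _ = texel_xyz.shape
    accum = np.zeros((H * W, 3))
    weight = np.zeros(H * W)
    pts = texel_xyz.reshape(-1, 3)
    pts_h = np.concatenate([pts, np.ones((pts.shape[0], 1))], axis=1)
    covered = texel_mask.reshape(-1)

    for img, K, T in zip(images, intrinsics, world_to_cam):
        h, w, _ = img.shape
        cam = (T @ pts_h.T).T[:, :3]           # points in camera coordinates
        z = cam[:, 2]
        in_front = z > 1e-6                    # discard points behind the camera
        safe_z = np.where(in_front, z, 1.0)    # avoid division by zero
        pix = (K @ cam.T).T                    # homogeneous pixel coordinates
        col = np.round(pix[:, 0] / safe_z).astype(int)
        row = np.round(pix[:, 1] / safe_z).astype(int)
        valid = (
            covered & in_front
            & (col >= 0) & (col < w)
            & (row >= 0) & (row < h)
        )
        # Nearest-neighbor lookup of the observed color for each valid texel.
        accum[valid] += img[row[valid], col[valid]]
        weight[valid] += 1.0

    texture = accum / np.maximum(weight, 1.0)[:, None]
    return texture.reshape(H, W, 3), weight.reshape(H, W)
```

In a pipeline like the one the abstract describes, a texture built this way would only be the input to a learned UV-space module. The sketch stops at the back-projection and does not attempt to model that module or the material prior.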