Neural Multi-View Self-Calibrated Photometric Stereo without Photometric Stereo Cues
This work addresses the challenge of 3D reconstruction for objects with complex geometry and reflectance, a core problem in computer vision, though the contribution is incremental because it builds on existing neural inverse rendering methods.
The paper tackles the problem of jointly reconstructing geometry, reflectance, and lighting from multi-view images under varying lighting without requiring light calibration or intermediate cues, achieving state-of-the-art accuracy in shape and lighting estimation.
We propose a neural inverse rendering approach that jointly reconstructs geometry, spatially varying reflectance, and lighting conditions from multi-view images captured under varying directional lighting. Unlike prior multi-view photometric stereo methods that require light calibration or intermediate cues such as per-view normal maps, our method jointly optimizes all scene parameters from raw images in a single stage. We represent both geometry and reflectance as neural implicit fields and apply shadow-aware volume rendering. A spatial network first predicts the signed distance and a reflectance latent code for each scene point. A reflectance network then estimates reflectance values conditioned on the latent code and angularly encoded surface normal, view, and light directions. The proposed method outperforms state-of-the-art normal-guided approaches in shape and lighting estimation accuracy, generalizes to view-unaligned multi-light images, and handles objects with challenging geometry and reflectance.
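The two-network data flow described above can be illustrated with a minimal sketch. It is not the authors' implementation. The spatial network is replaced by an analytic unit-sphere SDF with a made-up latent code, the reflectance network is a tiny MLP with random untrained weights, and the angular encoding (cosines between the normal, view, light, and half vector) is an assumed choice, since the abstract does not specify one. All function names, `LATENT_DIM`, and the cosine-weighted shading are hypothetical. Shadow-aware volume rendering and the optimization of lighting are omitted, so light direction and intensity are fixed inputs here.

```python
import math
import random

LATENT_DIM = 4  # hypothetical size of the reflectance latent code


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def spatial_network(x):
    """Stand-in for the spatial network: returns (signed distance, latent code).

    Assumption: a unit-sphere SDF and a sinusoidal latent replace the learned MLP.
    """
    sdf = math.sqrt(dot(x, x)) - 1.0
    latent = [math.sin((k + 1) * x[k % 3]) for k in range(LATENT_DIM)]
    return sdf, latent


def sdf_normal(x, eps=1e-4):
    """Surface normal as the normalized SDF gradient (central differences)."""
    grad = []
    for i in range(3):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((spatial_network(hi)[0] - spatial_network(lo)[0]) / (2 * eps))
    return normalize(grad)


def angular_encoding(n, v, l):
    """Assumed encoding: cosines between normal, view, light, and half vector.

    Requires v != -l so that the half vector is well defined.
    """
    h = normalize([a + b for a, b in zip(v, l)])
    return [dot(n, l), dot(n, v), dot(n, h), dot(v, h)]


def make_reflectance_network(seed=0, hidden=8):
    """Build a tiny untrained MLP mapping (latent, encoding) to RGB reflectance."""
    rng = random.Random(seed)
    in_dim = LATENT_DIM + 4
    w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(3)]

    def reflectance_network(latent, enc):
        feats = latent + enc
        hid = [max(0.0, dot(row, feats)) for row in w1]  # ReLU layer
        # Softplus output keeps the reflectance values non-negative.
        return [math.log1p(math.exp(dot(row, hid))) for row in w2]

    return reflectance_network


def shade_point(x, view_dir, light_dir, light_intensity, reflectance_network):
    """Radiance at x under one directional light (no shadows, no volume rendering)."""
    _, latent = spatial_network(x)
    n = sdf_normal(x)
    enc = angular_encoding(n, view_dir, light_dir)
    rho = reflectance_network(latent, enc)
    cos_term = max(0.0, dot(n, light_dir))  # light behind the surface gives zero
    return [light_intensity * r * cos_term for r in rho]


# Usage: shade the north pole of the sphere under one directional light.
net = make_reflectance_network()
point = [0.0, 0.0, 1.0]
view = normalize([0.0, 0.3, 1.0])
light = normalize([0.5, 0.0, 1.0])
rgb = shade_point(point, view, light, 2.0, net)
```

In a full pipeline of the kind the abstract describes, both networks and the per-image light parameters would be optimized together against the captured images. The sketch only shows how a point's signed distance, latent code, and angular inputs could combine into a shaded value.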