MVPSNet: Fast Generalizable Multi-view Photometric Stereo
This work addresses 3D reconstruction in textureless regions, where traditional multi-view stereo methods fail, and runs far faster than existing methods that optimize a neural network per scene.
The authors tackle multi-view photometric stereo (MVPS) with MVPSNet, a fast and generalizable method that uses Light Aggregated Feature Maps (LAFM) for geometric feature extraction. It achieves reconstruction results similar to the state-of-the-art PS-NeRF while being 411 times faster at inference (105 seconds vs. 12 hours).
We propose a fast and generalizable solution to Multi-view Photometric Stereo (MVPS), called MVPSNet. The key to our approach is a feature extraction network that effectively combines images from the same view captured under multiple lighting conditions to extract geometric features from shading cues for stereo matching. We demonstrate that these features, termed `Light Aggregated Feature Maps' (LAFM), are effective for feature matching even in textureless regions, where traditional multi-view stereo methods fail. Our method produces reconstruction results similar to those of PS-NeRF, a state-of-the-art MVPS method that optimizes a neural network per scene, while being 411$\times$ faster (105 seconds vs. 12 hours) in inference. Additionally, we introduce a new synthetic dataset for MVPS, sMVPS, which we show is effective for training a generalizable MVPS method.
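To make the idea of combining images from one view under several lights more concrete, the following is a minimal sketch, not the MVPSNet architecture. The abstract does not specify the feature extractor or the aggregation operator, so both are assumptions here: `per_light_features` is a hypothetical stand-in (finite differences instead of a learned network), and `aggregate_over_lights` assumes an element-wise maximum across lights, chosen only because it does not depend on the order or number of lights.

```python
from typing import List

Grid = List[List[float]]  # one H x W channel


def per_light_features(image: Grid) -> List[Grid]:
    """Hypothetical stand-in for a learned, shared-weight feature extractor.

    Returns two channels (horizontal and vertical finite differences),
    which respond to shading gradients in a single lit image.
    """
    h, w = len(image), len(image[0])
    dx = [[image[y][min(x + 1, w - 1)] - image[y][x] for x in range(w)]
          for y in range(h)]
    dy = [[image[min(y + 1, h - 1)][x] - image[y][x] for x in range(w)]
          for y in range(h)]
    return [dx, dy]


def aggregate_over_lights(images: List[Grid]) -> List[Grid]:
    """Assumed aggregation: element-wise max over per-light feature maps.

    Input: images of the same view under different lights.
    Output: one feature map per channel, independent of light ordering.
    """
    feats = [per_light_features(img) for img in images]
    n_ch = len(feats[0])
    h, w = len(images[0]), len(images[0][0])
    return [[[max(f[c][y][x] for f in feats) for x in range(w)]
             for y in range(h)]
            for c in range(n_ch)]


# Two toy images of the same view, lit from opposite sides.
lit_left = [[0.0, 1.0], [0.0, 1.0]]
lit_right = [[1.0, 0.0], [1.0, 0.0]]
aggregated = aggregate_over_lights([lit_left, lit_right])
```

Under these assumptions, the aggregated map is identical whichever order the lit images are supplied in, which is the property a per-view feature needs before it is passed to stereo matching across views.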