UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction
This work addresses a key bottleneck in 3D reconstruction for computer vision: it removes the need for labor-intensive object mask annotations, although it builds incrementally on existing neural implicit surface and radiance field methods.
The paper tackles surface reconstruction from multi-view images without accurate object masks as supervision. It performs on par with methods that require masks, such as IDR, and outperforms NeRF in reconstruction quality on datasets such as DTU and BlendedMVS.
Neural implicit 3D representations have emerged as a powerful paradigm for reconstructing surfaces from multi-view images and synthesizing novel views. Unfortunately, existing methods such as DVR or IDR require accurate per-pixel object masks as supervision. At the same time, neural radiance fields have revolutionized novel view synthesis. However, NeRF's estimated volume density does not admit accurate surface reconstruction. Our key insight is that implicit surface models and radiance fields can be formulated in a unified way, enabling both surface and volume rendering using the same model. This unified perspective enables novel, more efficient sampling procedures and the ability to reconstruct accurate surfaces without input masks. We evaluate our method on the DTU and BlendedMVS datasets and on a synthetic indoor dataset. Our experiments demonstrate that we outperform NeRF in terms of reconstruction quality while performing on par with IDR without requiring masks.
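The unified view of surface and volume rendering can be illustrated with a minimal sketch. It assumes that the occupancy value predicted at each sample along a ray takes the place of the alpha value in standard volume rendering. The function name `render_ray`, its argument layout, and the NumPy implementation are illustrative assumptions, not the authors' code.

```python
import numpy as np

def render_ray(occupancy, colors):
    """Composite colors along one ray, using occupancy values as alphas.

    occupancy: (N,) values in [0, 1], ordered from near to far (assumed layout).
    colors:    (N, 3) RGB color predicted at each sample.
    Returns the rendered RGB color and the per-sample weights.
    """
    occupancy = np.asarray(occupancy, dtype=float)
    colors = np.asarray(colors, dtype=float)

    # Probability that the ray reaches sample i without being blocked
    # by any earlier sample: prod_{j < i} (1 - o_j).
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - occupancy[:-1])))

    # Weight of each sample: it must be reached and must be occupied.
    weights = occupancy * transmittance

    return weights @ colors, weights
```

With soft occupancy values the weights spread over several samples, which behaves like volume rendering. With binary occupancy the weights collapse onto the first occupied sample, so the same function returns the color at the surface intersection, which is the surface rendering case.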