Improving Neural Radiance Fields with Depth-aware Optimization for Novel View Synthesis
This work addresses inaccurate 3D reconstruction in NeRF for novel view synthesis, particularly under sparse input conditions. The contribution is incremental: it builds on existing NeRF methods by adding constraints.
The paper tackles the failure of Neural Radiance Fields (NeRF) to obtain accurate 3D structures from sparse inputs, which leads to poor novel view synthesis. It proposes SfMNeRF, which leverages self-supervised depth estimation to constrain geometry. The method improves both view synthesis and 3D-scene geometry, and surpasses state-of-the-art approaches on two public datasets.
Given dense inputs, Neural Radiance Fields (NeRF) can render photo-realistic novel views of static scenes. Although the synthesis quality is excellent, existing NeRF-based methods fail to recover reasonably accurate three-dimensional (3D) structures. With sparse inputs, novel view synthesis quality drops dramatically because the implicitly reconstructed 3D-scene structure is inaccurate. We propose SfMNeRF, a method that both synthesizes better novel views and reconstructs the 3D-scene geometry. SfMNeRF leverages knowledge from self-supervised depth estimation methods to constrain the 3D-scene geometry during view synthesis training. Specifically, SfMNeRF employs epipolar, photometric consistency, depth smoothness, and position-of-matches constraints to explicitly reconstruct the 3D-scene structure. Through these explicit constraints and the implicit constraint from NeRF, our method improves both the view synthesis and the 3D-scene geometry performance of NeRF. In addition, SfMNeRF synthesizes novel sub-pixels whose ground truth is obtained by image interpolation. This strategy lets SfMNeRF train on more samples, which improves generalization. Experiments on two public datasets demonstrate that SfMNeRF surpasses state-of-the-art approaches. Code is available at https://github.com/XTU-PR-LAB/SfMNeRF
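
The sub-pixel strategy mentioned in the abstract can be illustrated with a minimal sketch. It assumes bilinear interpolation, since the abstract does not say which interpolation scheme is used, and `bilinear_sample` is a hypothetical helper written for this illustration, not a function from the SfMNeRF repository. The idea is that a ray cast through a non-integer pixel location can still be supervised, because a target colour can be interpolated from the four neighbouring pixels.

```python
import numpy as np


def bilinear_sample(image, x, y):
    """Interpolate the colour of `image` (H, W, C) at a continuous location.

    `x` runs along the width (columns) and `y` along the height (rows).
    Bilinear interpolation is an assumption made for this sketch.
    """
    h, w = image.shape[:2]
    # Clamp the query so the 2x2 neighbourhood stays inside the image.
    x = float(np.clip(x, 0.0, w - 1.0))
    y = float(np.clip(y, 0.0, h - 1.0))
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0  # fractional offsets inside the pixel cell
    # Blend along x on the two rows, then blend the rows along y.
    top = (1.0 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1.0 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1.0 - wy) * top + wy * bottom


# Toy 2x2 single-channel image; the centre of the four pixels is their mean.
image = np.array([[[0.0], [1.0]], [[2.0], [3.0]]])
target = bilinear_sample(image, 0.5, 0.5)  # interpolated sub-pixel target
```

In a training loop, a target like `target` would stand in for the ground-truth colour of a ray cast through the same sub-pixel location, so the number of supervised samples is no longer limited to the integer pixel grid.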