CV | Jul 8, 2024

Enhancing Neural Radiance Fields with Depth and Normal Completion Priors from Sparse Views

arXiv:2407.05666v1 | 1 citation | h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses the challenge of accurate 3D scene reconstruction from limited images for applications such as virtual reality and robotics, and represents an incremental improvement over existing NeRF methods.

The paper tackles the problem of Neural Radiance Fields (NeRF) struggling with sparse input views by proposing CP_NeRF, a framework that integrates depth and normal completion priors to guide optimization, resulting in superior rendering of detailed indoor scenes compared to leading techniques.

Neural Radiance Fields (NeRF) synthesize highly realistic novel views by fitting a neural network model to images of a scene. However, NeRF often fails to render views accurately when only a few input images are available, because sparse views do not provide enough structural information to constrain the optimization. To address this, we propose the Depth and Normal Dense Completion Priors for NeRF (CP_NeRF) framework, which improves view rendering by adding depth and normal dense completion priors to the NeRF optimization process. Before optimizing NeRF, we obtain sparse depth maps from the same Structure from Motion (SfM) step that provides the camera poses. From the sparse depth maps and a normal estimator, we generate sparse normal maps, which we use to train a normal completion prior with precise standard deviations. During optimization, the depth and normal completion priors transform the sparse data into dense depth and normal maps together with their standard deviations. We use these dense maps to guide ray sampling, assist distance sampling, and construct a normal loss function for better training accuracy. To improve NeRF's rendered normals, we incorporate an optical centre position embedder that helps synthesize more accurate normals through volume rendering. Additionally, we employ a normal patch matching technique to select accurate rendered normal maps, ensuring more precise supervision for the model. Our method outperforms leading techniques in rendering detailed indoor scenes, even with limited input views.
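
The abstract says the dense maps and their standard deviations "assist distance sampling" and feed a normal loss, but gives no formulas. The sketch below is one plausible reading, not the paper's implementation. It assumes that part of each ray's sample budget is drawn from a Gaussian centred on the completed depth, with the predicted standard deviation as its width, and that the normal loss is one minus the cosine similarity between the rendered and prior normals. The function names, the even split between uniform and guided samples, and the cosine form of the loss are all hypothetical choices made for illustration.

```python
import math
import random


def depth_guided_samples(near, far, depth_mean, depth_std,
                         n_uniform=8, n_guided=8, rng=None):
    """Return sorted sample distances along one ray (illustrative only).

    n_uniform samples cover [near, far] by stratified sampling; n_guided
    samples are drawn around the depth prior and clamped to [near, far].
    """
    if rng is None:
        rng = random.Random()
    bin_width = (far - near) / n_uniform
    # One jittered sample per equal-width bin keeps coverage of the whole ray.
    uniform = [near + (i + rng.random()) * bin_width for i in range(n_uniform)]
    # A small depth_std concentrates samples near the predicted surface;
    # a large one (an uncertain prior) spreads them out again.
    guided = [min(max(rng.gauss(depth_mean, depth_std), near), far)
              for _ in range(n_guided)]
    return sorted(uniform + guided)


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [c / length for c in v]


def normal_cosine_loss(rendered, prior):
    """1 - cos(angle) between two normals: 0 if aligned, 2 if opposite."""
    a, b = normalize(rendered), normalize(prior)
    return 1.0 - sum(x * y for x, y in zip(a, b))
```

For example, `depth_guided_samples(0.1, 6.0, depth_mean=2.5, depth_std=0.05)` returns 16 distances, 8 of them clustered tightly around 2.5. A per-pixel weight derived from the normal prior's standard deviation could scale `normal_cosine_loss`, but the abstract does not say whether the paper does this.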

