CV · Jan 22, 2024

HG3-NeRF: Hierarchical Geometric, Semantic, and Photometric Guided Neural Radiance Fields for Sparse View Inputs

arXiv:2401.11711v1 · 6.57 citations · h-index: 3

Originality: Highly original

AI Analysis

This paper addresses a key limitation of NeRF for novel view synthesis: degraded performance when only a few input views are available. The improvement is incremental but important for practical applications such as 3D reconstruction.

The paper tackles the poor performance of Neural Radiance Fields (NeRF) on sparse view inputs by introducing HG3-NeRF, which uses hierarchical geometric, semantic, and photometric guidance to improve consistency across views. The authors report that it outperforms state-of-the-art methods on standard benchmarks and produces high-fidelity synthesis results.

Neural Radiance Fields (NeRF) have garnered considerable attention as a paradigm for novel view synthesis by learning scene representations from discrete observations. Nevertheless, NeRF exhibits pronounced performance degradation when confronted with sparse view inputs, which curtails its applicability. In this work, we introduce Hierarchical Geometric, Semantic, and Photometric Guided NeRF (HG3-NeRF), a novel methodology that addresses this limitation and enhances the consistency of geometry, semantic content, and appearance across different views. We propose Hierarchical Geometric Guidance (HGG) to incorporate a by-product of Structure from Motion (SfM), namely the sparse depth prior, into the scene representations. Unlike direct depth supervision, HGG samples volume points from local-to-global geometric regions, mitigating the misalignment caused by inherent bias in the depth prior. Furthermore, we draw inspiration from the notable variations in semantic consistency observed across images of different resolutions and propose Hierarchical Semantic Guidance (HSG) to learn coarse-to-fine semantic content, which corresponds to the coarse-to-fine scene representations. Experimental results demonstrate that HG3-NeRF outperforms other state-of-the-art methods on different standard benchmarks and achieves high-fidelity synthesis results for sparse view inputs.
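
The abstract does not spell out how HGG chooses its local-to-global regions, so the following is only a rough sketch of one way such a schedule could look, not the authors' implementation. It assumes each ray has a scalar SfM depth prior and widens the per-ray sampling interval from a narrow window around that prior to the full near-far range as training progresses. The function names, the linear schedule, and the 5% initial window are all hypothetical choices made for illustration.

```python
import numpy as np


def sampling_bounds(depth_prior, near, far, progress):
    """Per-ray sampling interval, widened from local to global.

    progress=0 gives a narrow window around the depth prior;
    progress=1 gives the full [near, far] range.
    The linear interpolation is an assumption, not taken from the paper.
    """
    progress = float(np.clip(progress, 0.0, 1.0))
    # Hypothetical initial half-width: 5% of the scene depth range.
    half_width = 0.05 * (far - near)
    local_lo = np.clip(depth_prior - half_width, near, far)
    local_hi = np.clip(depth_prior + half_width, near, far)
    lo = (1.0 - progress) * local_lo + progress * near
    hi = (1.0 - progress) * local_hi + progress * far
    return lo, hi


def sample_depths(depth_prior, near, far, progress, n_samples, rng):
    """Stratified depth samples inside the current per-ray interval."""
    lo, hi = sampling_bounds(depth_prior, near, far, progress)
    n_rays = depth_prior.shape[0]
    # One uniform draw per bin, so samples stay ordered along each ray.
    jitter = rng.uniform(size=(n_rays, n_samples))
    fractions = (np.arange(n_samples) + jitter) / n_samples
    return lo[:, None] + (hi - lo)[:, None] * fractions


rng = np.random.default_rng(0)
priors = np.array([2.0, 4.0])  # example SfM depths for two rays
early = sample_depths(priors, near=1.0, far=6.0, progress=0.0, n_samples=8, rng=rng)
late = sample_depths(priors, near=1.0, far=6.0, progress=1.0, n_samples=8, rng=rng)
```

Early in training the samples cluster around each prior, so the prior guides geometry without being treated as exact ground truth. Later they cover the whole ray, which gives the model room to correct a biased prior. Rays without an SfM depth are not handled in this sketch.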
