Bayesian Monocular Depth Refinement via Neural Radiance Fields
This work addresses the need for more accurate depth maps in applications such as autonomous navigation and extended reality, though it appears incremental because it builds on existing monocular and NeRF methods.
The paper tackles the problem that monocular depth estimation produces overly smooth depth maps lacking fine geometric detail. It proposes MDENeRF, an iterative framework that refines monocular depth estimates using Neural Radiance Fields with Bayesian fusion, and reports superior performance on indoor scenes from the SUN RGB-D dataset.
Monocular depth estimation has applications in many fields, such as autonomous navigation and extended reality, making it an essential computer vision task. However, current methods often produce overly smooth depth maps that lack the fine geometric detail needed for accurate scene understanding. We propose MDENeRF, an iterative framework that refines monocular depth estimates using depth information from Neural Radiance Fields (NeRFs). MDENeRF consists of three components: (1) an initial monocular estimate for global structure, (2) a NeRF trained on perturbed viewpoints, with per-pixel uncertainty, and (3) Bayesian fusion of the noisy monocular and NeRF depths. We derive the NeRF uncertainty from the volume rendering process and use it to iteratively inject high-frequency detail, while the monocular prior maintains global structure. Experiments on indoor scenes from the SUN RGB-D dataset demonstrate superior performance on key metrics.
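The uncertainty and fusion steps described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the following rests on assumptions: the per-pixel NeRF uncertainty is taken to be the variance of sample depths under the volume rendering weights, and the monocular and NeRF depths are treated as independent Gaussian estimates combined by precision weighting. The function names `render_depth_stats` and `fuse_gaussian` are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def render_depth_stats(
    weights: Sequence[float], depths: Sequence[float]
) -> Tuple[float, float]:
    """Expected depth and depth variance along one ray.

    Assumption: `weights` are the usual volume rendering weights of the
    samples on the ray and `depths` are the sample distances.
    """
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("ray has no accumulated opacity; depth is undefined")
    # Normalize so the weights form a distribution over the samples.
    probs = [w / total for w in weights]
    mean = sum(p * t for p, t in zip(probs, depths))
    var = sum(p * (t - mean) ** 2 for p, t in zip(probs, depths))
    return mean, var


def fuse_gaussian(
    mu_a: float, var_a: float, mu_b: float, var_b: float, eps: float = 1e-8
) -> Tuple[float, float]:
    """Precision-weighted fusion of two independent Gaussian depth estimates.

    Assumption: both estimates have Gaussian noise; `eps` guards against
    division by zero when a variance is exactly zero.
    """
    prec_a = 1.0 / (var_a + eps)
    prec_b = 1.0 / (var_b + eps)
    fused_var = 1.0 / (prec_a + prec_b)
    fused_mu = fused_var * (prec_a * mu_a + prec_b * mu_b)
    return fused_mu, fused_var


if __name__ == "__main__":
    # One pixel: a NeRF ray with three samples, and a monocular prior.
    nerf_mu, nerf_var = render_depth_stats([0.1, 0.8, 0.1], [1.0, 2.0, 3.0])
    mono_mu, mono_var = 2.5, 0.2  # hypothetical monocular depth and variance
    print(fuse_gaussian(mono_mu, mono_var, nerf_mu, nerf_var))
```

Under these assumptions, the estimate with the lower variance dominates the fused depth, so a confident NeRF ray can sharpen local detail while an uncertain one leaves the monocular prior largely intact. In an iterative scheme, the fused mean and variance would serve as the prior for the next round.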