CV AI Mar 19, 2024

Depth-guided NeRF Training via Earth Mover's Distance

arXiv:2403.13206v2, 3 citations, ECCV
Originality: Incremental advance
AI Analysis

This work addresses geometry disambiguation in NeRFs for 3D reconstruction, offering an incremental improvement over existing depth supervision methods.

The paper tackles the problem of ambiguous geometry in Neural Radiance Fields (NeRFs) by proposing a depth-guided training method that uses Earth Mover's Distance to handle uncertainty in depth priors, achieving large-margin improvements on standard depth metrics while maintaining photometric performance.

Neural Radiance Fields (NeRFs) are trained to minimize the rendering loss of predicted viewpoints. However, the photometric loss often does not provide enough information to disambiguate between different possible geometries that yield the same image. Previous work has therefore incorporated depth supervision during NeRF training, leveraging dense predictions from pre-trained depth networks as pseudo-ground truth. These depth priors are assumed to be perfect once filtered for noise, but in practice their accuracy is harder to capture. This work proposes a novel approach to uncertainty in depth priors for NeRF supervision. Instead of using custom-trained depth or uncertainty priors, we use off-the-shelf pretrained diffusion models to predict depth and capture uncertainty during the denoising process. Because depth priors are prone to errors, we propose to supervise the ray termination distance distribution with Earth Mover's Distance instead of forcing the rendered depth to replicate the depth prior exactly through an L2 loss. Our depth-guided NeRF outperforms all baselines on standard depth metrics by a large margin while maintaining performance on photometric measures.
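To make the contrast between the two kinds of depth supervision concrete, here is a minimal NumPy sketch. It is not the paper's loss: the function names (`l2_depth_loss`, `emd_1d`), the three-sample ray, and the point-mass prior are all illustrative assumptions. It only shows the general idea that an L2 loss on the rendered (expected) depth can be zero for a ray whose termination distribution is split between two surfaces, while a 1D Earth Mover's Distance on the distribution itself still penalizes that ray.

```python
import numpy as np

def l2_depth_loss(weights, t, d_prior):
    """Squared error between the expected ray depth and a scalar depth prior."""
    w = weights / weights.sum()
    depth = np.sum(w * t)  # rendered depth = expectation of t under the weights
    return float((depth - d_prior) ** 2)

def emd_1d(p, q, t):
    """1D Earth Mover's Distance between two histograms on the same sorted bins t.

    In one dimension, EMD equals the area between the two CDFs.
    """
    p = p / p.sum()
    q = q / q.sum()
    cdf_gap = np.abs(np.cumsum(p) - np.cumsum(q))[:-1]
    return float(np.sum(cdf_gap * np.diff(t)))

# Hypothetical ray with three sample distances (assumed values, not from the paper).
t = np.array([2.0, 4.0, 6.0])
bimodal = np.array([0.5, 0.0, 0.5])  # mass split between two candidate surfaces
peaked = np.array([0.0, 1.0, 0.0])   # all mass at a single surface
prior = np.array([0.0, 1.0, 0.0])    # depth prior as a point mass at t = 4

l2_bimodal = l2_depth_loss(bimodal, t, 4.0)  # 0.0: the expected depth is exactly 4
emd_bimodal = emd_1d(bimodal, prior, t)      # 2.0: half the mass moves 2 units each way
emd_peaked = emd_1d(peaked, prior, t)        # 0.0: distributions already match
```

In this toy case the L2 loss cannot distinguish the split ray from the correctly peaked one, whereas the EMD can. How the prior's uncertainty is turned into a target distribution is specific to the paper and is not modeled here.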

