ProvNeRF: Modeling per Point Provenance in NeRFs as a Stochastic Field
ProvNeRF addresses a key bottleneck in NeRF reconstruction quality for 3D vision applications, namely triangulation quality under sparse views, though it is an incremental improvement over existing methods.
The paper improves neural radiance fields (NeRFs) by modeling per-point provenance as a stochastic field, which accounts for the training camera pose distribution and the resulting triangulation quality. This yields better novel view synthesis and uncertainty estimation in sparse, unconstrained view settings.
Neural radiance fields (NeRFs) have gained popularity, with multiple works showing promising results across various applications. However, to the best of our knowledge, existing works do not explicitly model the distribution of training camera poses, or consequently the triangulation quality, a key factor affecting reconstruction quality that dates back to the classical vision literature. We close this gap with ProvNeRF, an approach that models the \textbf{provenance} of each point in a NeRF (i.e., the locations from which it is likely visible) as a stochastic field. We achieve this by extending implicit maximum likelihood estimation (IMLE) to functional space with an optimizable objective. We show that modeling per-point provenance during NeRF optimization enriches the model with information on triangulation, leading to improvements in novel view synthesis and uncertainty estimation over competitive baselines in the challenging sparse, unconstrained view setting.
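
For readers unfamiliar with IMLE, the following is a minimal sketch of the standard finite-dimensional IMLE objective that the abstract says ProvNeRF extends. It is not the paper's functional-space formulation. The function name `imle_loss` and the toy arrays are hypothetical, chosen only for illustration. Each data point is matched to its nearest generated sample, and the loss is the mean squared distance over those matches.

```python
import numpy as np


def imle_loss(data, samples):
    """Standard IMLE objective on finite-dimensional vectors (illustrative only).

    data:    array of shape (n_data, dim), the observed points
    samples: array of shape (n_samples, dim), outputs of a generator
             evaluated on random latent codes (assumed given here)

    Returns the mean squared distance from each data point to its nearest
    sample, together with the index of that nearest sample.
    """
    # Pairwise squared Euclidean distances, shape (n_data, n_samples).
    diff = data[:, None, :] - samples[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)

    # For every data point, select the closest generated sample.
    nearest = d2.argmin(axis=1)

    # Average the matched distances over the data points.
    loss = d2[np.arange(len(data)), nearest].mean()
    return loss, nearest


if __name__ == "__main__":
    # Hypothetical toy inputs: two data points, three generated samples.
    data = np.array([[0.0, 0.0], [1.0, 1.0]])
    samples = np.array([[0.0, 1.0], [1.0, 1.0], [5.0, 5.0]])
    loss, nearest = imle_loss(data, samples)
    print(loss, nearest)
```

In a training loop, this quantity would be minimized with respect to the generator parameters, which pulls at least one sample toward every data point. How ProvNeRF lifts the matching from vectors to functions is specified in the paper, not in this sketch.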