DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffusion Models
This work addresses the under-constrained nature of NeRFs for novel view synthesis, particularly in low-data scenarios. The contribution is incremental: it integrates diffusion models into NeRF training as a regularization technique.
The paper tackles the artifacts that appear in Neural Radiance Fields (NeRFs) trained with few input views. It learns a prior over scene geometry and color using a denoising diffusion model, which improves the quality of reconstructed geometry and novel view synthesis on datasets such as LLFF and DTU.
Under good conditions, Neural Radiance Fields (NeRFs) have shown impressive results on novel view synthesis tasks. NeRFs learn a scene's color and density fields by minimizing the photometric discrepancy between training views and differentiable renderings of the scene. Once trained from a sufficient set of views, NeRFs can generate novel views from arbitrary camera positions. However, the scene geometry and color fields are severely under-constrained, which can lead to artifacts, especially when trained with few input views. To alleviate this problem, we learn a prior over scene geometry and color using a denoising diffusion model (DDM). Our DDM is trained on RGBD patches of the synthetic Hypersim dataset and can be used to predict the gradient of the logarithm of a joint probability distribution of color and depth patches. We show that these gradients of the log-density of RGBD patches serve to regularize the geometry and color of a scene. During NeRF training, random RGBD patches are rendered and the estimated gradient of the log-likelihood is backpropagated to the color and density fields. Evaluations on LLFF, the most relevant dataset, show that our learned prior achieves improved quality in the reconstructed geometry and improved generalization to novel views. Evaluations on DTU show improved reconstruction quality among NeRF methods.
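The regularization idea in the abstract can be illustrated with a toy sketch. It is not the paper's implementation and rests on several simplifying assumptions. A closed-form Gaussian score stands in for the learned DDM. The "patch" is a short list of values optimized directly, rather than being rendered from NeRF color and density fields. The names `gaussian_score`, `regularized_step`, and `optimize_patch`, and the values chosen for `lr` and `weight`, are hypothetical. What the sketch does show is how the gradient of a log-prior can be combined with a photometric gradient, so that values with no supervision are still constrained.

```python
def gaussian_score(patch, mean, var):
    """Gradient of the log-density of an isotropic Gaussian prior.

    Assumption: this closed-form score is a stand-in for the DDM,
    which in the paper is a learned network over RGBD patches.
    """
    return [(m - x) / var for x, m in zip(patch, mean)]


def regularized_step(patch, target, observed, score_fn, lr=0.1, weight=0.5):
    """One gradient step on: photometric MSE (observed entries only)
    minus weight * log p(patch)."""
    score = score_fn(patch)
    updated = []
    for x, t, seen, s in zip(patch, target, observed, score):
        # Photometric gradient exists only where a training view supervises.
        photo_grad = 2.0 * (x - t) if seen else 0.0
        # The gradient of -log p(x) is -score, so the prior term pushes
        # the value along +score.
        updated.append(x - lr * (photo_grad - weight * s))
    return updated


def optimize_patch(patch, target, observed, prior_mean=0.5, prior_var=1.0,
                   steps=500, lr=0.1, weight=0.5):
    """Run repeated regularized steps (hypothetical driver for the sketch)."""
    mean = [prior_mean] * len(patch)

    def score_fn(p):
        return gaussian_score(p, mean, prior_var)

    for _ in range(steps):
        patch = regularized_step(patch, target, observed, score_fn, lr, weight)
    return patch


if __name__ == "__main__":
    # Two entries are supervised by a "training view"; two are not.
    final = optimize_patch(
        patch=[0.9, 0.1, 0.0, 1.0],
        target=[0.2, 0.8, 0.0, 0.0],
        observed=[True, True, False, False],
    )
    print(final)
```

In this toy setting the unsupervised entries are pulled to the prior mean, while the supervised entries settle between their targets and the prior. In the method described above, the same kind of gradient would instead flow through the differentiable renderer into the color and density fields.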