DiSR-NeRF: Diffusion-Guided View-Consistent Super-Resolution NeRF
This work addresses the challenge of generating high-resolution, view-consistent 3D scenes from low-resolution inputs, with applications in computer vision and graphics. It represents an incremental improvement that integrates diffusion models with NeRF.
The paper tackles the problem of view inconsistency in super-resolution for Neural Radiance Fields (NeRF) without high-resolution reference images, achieving better results than existing works on synthetic and real-world datasets.
We present DiSR-NeRF, a diffusion-guided framework for view-consistent super-resolution (SR) NeRF. Unlike prior works, we circumvent the requirement for high-resolution (HR) reference images by leveraging existing powerful 2D super-resolution models. However, 2D images that are super-resolved independently are often inconsistent across different views. We thus propose Iterative 3D Synchronization (I3DS) to mitigate the inconsistency problem via the inherent multi-view consistency property of NeRF. Specifically, our I3DS alternates between upscaling low-resolution (LR) rendered images with diffusion models, and updating the underlying 3D representation with standard NeRF training. We further introduce Renoised Score Distillation (RSD), a novel score-distillation objective for 2D image super-resolution. Our RSD combines features from ancestral sampling and Score Distillation Sampling (SDS) to generate sharp images that are also LR-consistent. Qualitative and quantitative results on both synthetic and real-world datasets demonstrate that our DiSR-NeRF achieves better results on NeRF super-resolution than existing works. Code and video results are available at the project website.
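The alternation at the core of I3DS can be illustrated with a deliberately small toy, not the paper's implementation. In this sketch every name (`render`, `upscale_2d`, `fit_scene`, `disagreement`, `i3ds_toy`) is hypothetical: the "scene" is a short list of floats, the diffusion upscaler is replaced by additive random detail that differs per view, and NeRF training is replaced by a per-pixel mean over views. The only point it aims to show is that views re-rendered from one shared representation agree with each other even when the per-view SR targets do not.

```python
import random


def render(scene, view):
    # Toy renderer (assumption): every view sees the same shared signal,
    # standing in for rendering an LR image from the 3D representation.
    return list(scene)


def upscale_2d(image, rng, noise=0.1):
    # Stand-in for a 2D diffusion SR model (assumption): adds detail that
    # differs on every call, mimicking view-inconsistent hallucinated detail.
    return [p + rng.gauss(0.0, noise) for p in image]


def fit_scene(targets):
    # Stand-in for NeRF training (assumption): fit one shared representation
    # to all views; here this is the per-pixel mean of the targets.
    n = len(targets)
    return [sum(col) / n for col in zip(*targets)]


def disagreement(images):
    # Mean per-pixel variance across views; 0 means perfectly consistent.
    n = len(images)
    total = 0.0
    for col in zip(*images):
        mean = sum(col) / n
        total += sum((p - mean) ** 2 for p in col) / n
    return total / len(images[0])


def i3ds_toy(scene, num_views=4, cycles=3, seed=0):
    rng = random.Random(seed)
    history = []
    for _ in range(cycles):
        # Step 1: render each view and upscale it independently in 2D.
        renders = [render(scene, v) for v in range(num_views)]
        targets = [upscale_2d(r, rng) for r in renders]
        # Step 2: synchronize by updating the shared representation.
        scene = fit_scene(targets)
        synced = [render(scene, v) for v in range(num_views)]
        history.append((disagreement(targets), disagreement(synced)))
    return scene, history


if __name__ == "__main__":
    _, history = i3ds_toy([0.0, 0.5, 1.0, 0.5])
    for cycle, (independent, synced) in enumerate(history):
        print(cycle, independent, synced)
```

Each history entry pairs the cross-view disagreement of the independently upscaled targets with that of the re-rendered views. The toy does not model RSD, LR-consistency, or real image detail; it only mirrors the upscale-then-synchronize loop structure described above.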