CV · AI · LG · Apr 15, 2024

Taming Latent Diffusion Model for Neural Radiance Field Inpainting

arXiv:2404.09995v2 · 20 citations · h-index: 25 · ECCV
Originality: Incremental advance
AI Analysis

This paper addresses a specific challenge in 3D reconstruction and editing for computer vision applications, and represents an incremental improvement over prior methods.

The paper tackles the problem of synthesizing reasonable geometry in completely uncovered regions during Neural Radiance Field (NeRF) inpainting, and reports state-of-the-art results on various real-world scenes.

Neural Radiance Field (NeRF) is a representation for 3D reconstruction from multi-view images. Although recent work has shown preliminary success in editing a reconstructed NeRF with a diffusion prior, these methods still struggle to synthesize reasonable geometry in completely uncovered regions. One major reason is the high diversity of content synthesized by the diffusion model, which hinders the radiance field from converging to a crisp and deterministic geometry. Moreover, applying latent diffusion models to real data often yields a textural shift that is incoherent with the image condition due to auto-encoding errors. These two problems are further reinforced by the use of pixel-distance losses. To address these issues, we propose tempering the diffusion model's stochasticity with per-scene customization and mitigating the textural shift with masked adversarial training. During our analyses, we also found that the commonly used pixel and perceptual losses are harmful in the NeRF inpainting task. Through rigorous experiments, our framework yields state-of-the-art NeRF inpainting results on various real-world scenes. Project page: https://hubert0527.github.io/MALD-NeRF

