Blended-NeRF: Zero-Shot Object Generation and Blending in Existing Neural Radiance Fields
This work addresses localized 3D scene editing, a problem relevant to both computer graphics and computer vision. It offers a method for generating new objects and blending them into existing NeRF scenes, building on existing components such as a pretrained language-image model and established geometric priors.
The paper tackles the challenge of editing specific objects or regions in 3D scenes represented by Neural Radiance Fields (NeRF). It introduces Blended-NeRF, a framework that takes a text prompt and a 3D region-of-interest (ROI) box, generates a new object inside the box, and blends it into the original scene. The authors report realistic, multi-view consistent results with greater flexibility and diversity than the baselines.
Editing a local region or a specific object in a 3D scene represented by a NeRF, or consistently blending a new realistic object into the scene, is challenging, mainly due to the implicit nature of the scene representation. We present Blended-NeRF, a robust and flexible framework for editing a specific region of interest in an existing NeRF scene, based on text prompts, along with a 3D ROI box. Our method leverages a pretrained language-image model to steer the synthesis towards a user-provided text prompt, along with a 3D MLP model initialized on an existing NeRF scene to generate the object and blend it into a specified region in the original scene. We allow local editing by localizing a 3D ROI box in the input scene, and blend the content synthesized inside the ROI with the existing scene using a novel volumetric blending technique. To obtain natural-looking and view-consistent results, we leverage existing and new geometric priors and 3D augmentations for improving the visual fidelity of the final result. We test our framework both qualitatively and quantitatively on a variety of real 3D scenes and text prompts, demonstrating realistic multi-view consistent results with much more flexibility and diversity than the baselines. Finally, we show the applicability of our framework for several 3D editing applications, including adding new objects to a scene, removing/replacing/altering existing objects, and texture conversion.
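To make the idea of blending content inside a 3D ROI box more concrete, the following is a minimal sketch, not the paper's actual volumetric blending technique, whose details are not given in the abstract. It assumes an axis-aligned box, a simple linear mix of densities and colors from two fields at sample points along one ray, and the standard NeRF quadrature for compositing. The function names (inside_box, blend_fields, render_ray) and the weight parameter are hypothetical and chosen only for illustration.

```python
import numpy as np

def inside_box(points, box_min, box_max):
    """Boolean mask of sample points that lie inside an axis-aligned ROI box."""
    return np.all((points >= box_min) & (points <= box_max), axis=-1)

def blend_fields(points, sigma_scene, rgb_scene, sigma_new, rgb_new,
                 box_min, box_max, weight=1.0):
    """Linearly mix two radiance fields at the given sample points.

    Assumption: a plain linear mix, applied only inside the ROI box.
    Outside the box the original scene is returned unchanged.
    """
    w = weight * inside_box(points, box_min, box_max).astype(np.float64)
    sigma = (1.0 - w) * sigma_scene + w * sigma_new
    rgb = (1.0 - w)[:, None] * rgb_scene + w[:, None] * rgb_new
    return sigma, rgb

def render_ray(sigma, rgb, deltas):
    """Standard NeRF quadrature: composite per-sample colors along one ray."""
    alpha = 1.0 - np.exp(-sigma * deltas)            # opacity of each sample
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)

# Two samples on a ray: the first inside the ROI box, the second outside it.
points = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
box_min = np.array([-1.0, -1.0, -1.0])
box_max = np.array([1.0, 1.0, 1.0])
sigma, rgb = blend_fields(
    points,
    sigma_scene=np.array([1.0, 2.0]), rgb_scene=np.zeros((2, 3)),
    sigma_new=np.array([5.0, 7.0]), rgb_new=np.ones((2, 3)),
    box_min=box_min, box_max=box_max,
)
pixel = render_ray(sigma, rgb, deltas=np.array([0.1, 0.1]))
```

In this sketch, samples outside the box keep the original scene's density and color, so edits stay local to the ROI. A smoother, distance-dependent weight could replace the hard box mask, but that is a design choice of this illustration rather than something stated in the text above.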