CV, GR · Nov 27, 2023

GaussianEditor: Editing 3D Gaussians Delicately with Text Instructions

arXiv:2311.16037v2 · 194 citations · h-index: 66
Originality: Incremental advance
AI Analysis

This paper addresses localized 3D scene editing for applications in computer graphics and vision, offering an incremental improvement over existing diffusion-based methods.

The paper tackles delicate, localized editing of 3D scenes from text instructions. It proposes GaussianEditor, a framework built on 3D Gaussian splatting that edits more precisely and trains faster than previous methods: within 20 minutes on a single V100 GPU, more than twice as fast as Instruct-NeRF2NeRF (45 minutes to 2 hours).

Recently, impressive results have been achieved in 3D scene editing with text instructions based on a 2D diffusion model. However, current diffusion models primarily generate images by predicting noise in the latent space, and the editing is usually applied to the whole image, which makes it challenging to perform delicate, especially localized, editing for 3D scenes. Inspired by recent 3D Gaussian splatting, we propose a systematic framework, named GaussianEditor, to edit 3D scenes delicately via 3D Gaussians with text instructions. Benefiting from the explicit property of 3D Gaussians, we design a series of techniques to achieve delicate editing. Specifically, we first extract the region of interest (RoI) corresponding to the text instruction, aligning it to 3D Gaussians. The Gaussian RoI is further used to control the editing process. Our framework can achieve more delicate and precise editing of 3D scenes than previous methods while enjoying much faster training speed, i.e., within 20 minutes on a single V100 GPU, more than twice as fast as Instruct-NeRF2NeRF (45 minutes to 2 hours).
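The abstract's two key steps, aligning an image-space RoI to 3D Gaussians and then using that Gaussian RoI to restrict editing, can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names (`gaussian_roi_from_mask`, `masked_update`), the pinhole camera model, the use of Gaussian centers alone (ignoring covariance and opacity), and the plain gradient step are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_roi_from_mask(centers, K, mask):
    """Flag Gaussians whose projected centers fall inside a 2D mask.

    Hypothetical helper, assuming:
      centers: (N, 3) Gaussian centers in camera coordinates (z forward)
      K:       (3, 3) pinhole intrinsics
      mask:    (H, W) boolean image-space region of interest
    """
    H, W = mask.shape
    z = centers[:, 2]
    in_front = z > 1e-6                      # discard points behind the camera
    safe_z = np.where(in_front, z, 1.0)      # avoid dividing by z <= 0
    uvw = centers @ K.T                      # homogeneous pixel coordinates
    u = np.floor(uvw[:, 0] / safe_z).astype(int)
    v = np.floor(uvw[:, 1] / safe_z).astype(int)
    in_image = in_front & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    roi = np.zeros(len(centers), dtype=bool)
    roi[in_image] = mask[v[in_image], u[in_image]]
    return roi

def masked_update(params, grads, roi, lr=0.01):
    """Gradient step that only changes Gaussians inside the RoI."""
    return params - lr * grads * roi[:, None]

# Toy scene: four Gaussians, one 64x48 view, one rectangular 2D RoI.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
mask = np.zeros((48, 64), dtype=bool)
mask[20:30, 28:40] = True
centers = np.array([[0.0, 0.0, 2.0],    # projects into the mask
                    [0.5, 0.0, 2.0],    # in the image, outside the mask
                    [0.0, 0.0, -1.0],   # behind the camera
                    [5.0, 0.0, 1.0]])   # outside the image
roi = gaussian_roi_from_mask(centers, K, mask)
colors = np.ones((4, 3))
edited = masked_update(colors, np.ones((4, 3)), roi, lr=0.5)
```

In this toy scene only the first Gaussian is selected, so only its parameters move during the update; the other three stay untouched, which is the sense in which an explicit Gaussian RoI keeps an edit localized.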

