CV · May 23, 2024

TIGER: Text-Instructed 3D Gaussian Retrieval and Coherent Editing

arXiv:2405.14455v2 · 13 citations · h-index: 9
Originality: Incremental advance
AI Analysis

The paper addresses the need for coherent object editing in 3D scenes, with applications in computer vision and graphics. It is an incremental improvement over existing techniques.

The paper tackles the problem of editing objects in 3D Gaussian Splatting scenes by proposing TIGER, which uses a bottom-up language aggregation strategy for retrieval and a Coherent Score Distillation method for editing, resulting in more consistent and realistic edits than prior work.

Editing objects within a scene is a critical functionality required across a broad spectrum of applications in computer vision and graphics. As 3D Gaussian Splatting (3DGS) emerges as a frontier in scene representation, the effective modification of 3D Gaussian scenes has become increasingly vital. This process entails accurately retrieving the target objects and subsequently performing modifications based on instructions. Though available in pieces, existing techniques mainly embed sparse semantics into Gaussians for retrieval, and rely on an iterative dataset update paradigm for editing, leading to over-smoothing or inconsistency issues. To this end, this paper proposes a systematic approach, namely TIGER, for coherent text-instructed 3D Gaussian retrieval and editing. In contrast to the top-down language grounding approach for 3D Gaussians, we adopt a bottom-up language aggregation strategy to generate denser language-embedded 3D Gaussians that support open-vocabulary retrieval. To overcome the over-smoothing and inconsistency issues in editing, we propose a Coherent Score Distillation (CSD) that aggregates a 2D image editing diffusion model and a multi-view diffusion model for score distillation, producing multi-view consistent editing with much finer details. In various experiments, we demonstrate that our TIGER is able to accomplish more consistent and realistic edits than prior work.
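The abstract names two mechanisms without giving formulas, so the following is only a rough sketch of the general ideas, not the paper's method. It assumes that each Gaussian carries a language embedding in the same space as the text query, that retrieval is a cosine-similarity threshold, and that the two diffusion models' noise predictions are combined by a weighted sum before a standard score-distillation residual is formed. The function names, the `threshold` and `lam` parameters, and the weighted-sum form are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (assumed non-zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def retrieve_gaussians(gaussian_feats, text_feat, threshold=0.5):
    """Return indices of Gaussians whose embedding matches the text query.

    gaussian_feats: list of per-Gaussian language embeddings (assumed layout).
    text_feat: embedding of the query text in the same space (assumed).
    threshold: hypothetical similarity cut-off.
    """
    return [
        i
        for i, feat in enumerate(gaussian_feats)
        if cosine(feat, text_feat) > threshold
    ]


def combined_distillation_grad(eps_edit, eps_mv, eps, lam=0.5, weight_t=1.0):
    """Score-distillation-style residual built from two noise predictions.

    eps_edit: noise predicted by a 2D image editing diffusion model (stand-in).
    eps_mv:   noise predicted by a multi-view diffusion model (stand-in).
    eps:      noise that was actually added to the rendered image.
    lam:      hypothetical mixing weight; the weighted sum is an assumption.
    weight_t: timestep-dependent weight, treated here as a constant.
    """
    return [
        weight_t * (lam * e + (1.0 - lam) * m - n)
        for e, m, n in zip(eps_edit, eps_mv, eps)
    ]


if __name__ == "__main__":
    # Toy 2-D embeddings: Gaussians 0 and 2 point roughly along the query.
    feats = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
    print(retrieve_gaussians(feats, [1.0, 0.0]))
    # Toy noise vectors standing in for per-pixel predictions.
    print(combined_distillation_grad([1.0, 1.0], [0.0, 0.0], [0.25, 0.25]))
```

In a real pipeline the residual would be backpropagated through the renderer to update the retrieved Gaussians only; that step is omitted here because the abstract does not describe it.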

