CV · Aug 15, 2025

CoreEditor: Consistent 3D Editing via Correspondence-constrained Diffusion

arXiv:2508.11603v12 citations · h-index: 19 · IEEE Trans Vis Comput Graph
Originality: Highly original
AI Analysis

CoreEditor addresses blurry and insufficient edits in text-driven 3D scene editing, a known bottleneck for practitioners in computer graphics and AI, and proposes a novel method to resolve it.

The paper tackles inconsistent cross-view edits in text-driven 3D editing by introducing CoreEditor, which uses a correspondence-constrained attention mechanism to enforce multi-view consistency. The reported result is higher-quality edits with sharper details that outperform prior methods.

Text-driven 3D editing seeks to modify 3D scenes according to textual descriptions, and most existing approaches tackle this by adapting pre-trained 2D image editors to multi-view inputs. However, without explicit control over multi-view information exchange, they often fail to maintain cross-view consistency, leading to insufficient edits and blurry details. We introduce CoreEditor, a novel framework for consistent text-to-3D editing. The key innovation is a correspondence-constrained attention mechanism that enforces precise interactions between pixels expected to remain consistent throughout the diffusion denoising process. Beyond relying solely on geometric alignment, we further incorporate semantic similarity estimated during denoising, enabling more reliable correspondence modeling and robust multi-view editing. In addition, we design a selective editing pipeline that allows users to choose preferred results from multiple candidates, offering greater flexibility and user control. Extensive experiments show that CoreEditor produces high-quality, 3D-consistent edits with sharper details, significantly outperforming prior methods.
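To make the idea of correspondence-constrained attention more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not say how geometric alignment and semantic similarity are combined, so this sketch assumes a simple union: a pixel in one view may attend to a pixel in another view if a geometric mask marks them as corresponding, or if the cosine similarity of their features exceeds a threshold. The function names, the threshold value, and the fallback to unconstrained attention for pixels with no correspondence are all hypothetical choices made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na > 0 and nb > 0 else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def correspondence_mask(geo_mask, feats_a, feats_b, sem_threshold=0.9):
    """Assumed rule: allow pair (i, j) if it is geometrically aligned
    OR its feature cosine similarity reaches sem_threshold."""
    return [
        [bool(geo_mask[i][j]) or cosine(fa, fb) >= sem_threshold
         for j, fb in enumerate(feats_b)]
        for i, fa in enumerate(feats_a)
    ]


def constrained_attention(queries, keys, values, mask):
    """Scaled dot-product attention where query i only sees keys j
    with mask[i][j] == True."""
    scale = math.sqrt(len(queries[0]))
    out = []
    for i, q in enumerate(queries):
        allowed = [j for j, ok in enumerate(mask[i]) if ok]
        if not allowed:
            # Assumption: with no correspondence, fall back to all keys.
            allowed = list(range(len(keys)))
        weights = softmax([dot(q, keys[j]) / scale for j in allowed])
        out.append([
            sum(w * values[j][c] for w, j in zip(weights, allowed))
            for c in range(len(values[0]))
        ])
    return out
```

In a real diffusion pipeline the queries, keys, and values would come from the denoising network's attention layers, and the geometric mask from depth-based reprojection between views; both are outside the scope of this sketch.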
