CV · Mar 21, 2025

DCEdit: Dual-Level Controlled Image Editing via Precisely Localized Semantics

Yihan Hu, Jianing Peng, Yiheng Lin, Ting Liu, Xiaochao Qu, Luoqi Liu, Yao Zhao, Yunchao Wei

arXiv:2503.16795v114.411 citations · h-index: 18

Originality: Incremental advance

AI Analysis

This work addresses the problem of precise semantic editing in images for users of diffusion-based models, representing an incremental improvement over existing methods.

The paper tackles the challenge of precisely locating and editing target semantics in text-guided image editing by introducing a Precise Semantic Localization strategy and Dual-Level Control mechanism, achieving superior performance in preserving background and providing accurate edits on benchmarks like PIE-Bench and RW-800.

This paper presents a novel approach to improving text-guided image editing using diffusion-based models. The key challenge in text-guided image editing is to precisely locate and edit the target semantics, and previous methods fall short in this respect. Our method introduces a Precise Semantic Localization strategy that leverages visual and textual self-attention to enhance the cross-attention map, which can then serve as a regional cue to improve editing performance. We then propose a Dual-Level Control mechanism that incorporates regional cues at both the feature and latent levels, offering fine-grained control for more precise edits. To fully compare our method with other DiT-based approaches, we construct the RW-800 benchmark, featuring high-resolution images, long descriptive texts, real-world images, and a new text editing task. Experimental results on the popular PIE-Bench and RW-800 benchmarks demonstrate the superior performance of our approach in preserving the background and providing accurate edits.
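The abstract names two ideas: sharpening a cross-attention map with self-attention, and using the resulting regional cue to constrain the edit. The sketch below is not the paper's implementation. It is a generic, dependency-free illustration under stated assumptions: attention maps are flattened to plain Python lists, only patch-to-patch (visual) self-attention is used, the textual self-attention and feature-level control are omitted, and the latent-level step is shown as a simple mask-weighted blend. All function names and the threshold value are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def refine_cross_attention(self_attn: Matrix, cross_attn: List[float]) -> List[float]:
    """Propagate one token's cross-attention over image patches through a
    patch-to-patch self-attention matrix, then min-max normalise to [0, 1].

    Assumption: self_attn[i][j] is how strongly patch i attends to patch j,
    and cross_attn[j] is the target token's attention on patch j.
    """
    refined = [sum(w * c for w, c in zip(row, cross_attn)) for row in self_attn]
    lo, hi = min(refined), max(refined)
    if hi == lo:
        # A flat map carries no localisation signal.
        return [0.0 for _ in refined]
    return [(v - lo) / (hi - lo) for v in refined]


def to_mask(attn: List[float], threshold: float = 0.5) -> List[float]:
    """Binarise a normalised attention map (threshold is an assumed value)."""
    return [1.0 if v >= threshold else 0.0 for v in attn]


def blend_latents(source: List[float], edited: List[float], mask: List[float]) -> List[float]:
    """Keep source latents outside the mask and edited latents inside it."""
    return [m * e + (1.0 - m) * s for s, e, m in zip(source, edited, mask)]


# Toy example: four patches, where patches 0-1 and patches 2-3 form two regions.
self_attn = [
    [0.7, 0.3, 0.0, 0.0],
    [0.3, 0.7, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.5],
]
cross_attn = [0.9, 0.1, 0.0, 0.0]  # raw map only highlights patch 0

mask = to_mask(refine_cross_attention(self_attn, cross_attn))
blended = blend_latents([1.0, 1.0, 1.0, 1.0], [5.0, 5.0, 5.0, 5.0], mask)
print(mask)
print(blended)
```

In this toy setup the raw cross-attention covers only one patch of the target region, while the refined mask covers both patches that attend to each other, and the blend leaves the other two patches at their source values. This mirrors the background-preservation goal described above, without claiming to match the paper's actual formulation.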

