CV, MM | Jun 2, 2022

DE-Net: Dynamic Text-guided Image Editing Adversarial Networks

arXiv:2206.01160v2 | 19 citations | h-index: 28 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses limitations in text-guided image editing for applications like content creation, but it is incremental as it builds on existing models.

The paper tackles three problems in text-guided image editing: over-editing, insufficient editing, and inaccurate distinction between text-required and text-irrelevant parts. It proposes DE-Net, which combines dynamically composed editing modules with text-adaptive convolution, and reports that it manipulates source images more correctly and accurately.

Text-guided image editing models have shown remarkable results. However, two problems remain. First, they employ fixed manipulation modules for various editing requirements (e.g., color changing, texture changing, content adding and removing), which results in over-editing or insufficient editing. Second, they do not clearly distinguish between text-required and text-irrelevant parts, which leads to inaccurate editing. To address these limitations, we propose: (i) a Dynamic Editing Block (DEBlock), which composes different editing modules dynamically for various editing requirements; (ii) a Composition Predictor (Comp-Pred), which predicts the composition weights for DEBlock by inference on target texts and source images; and (iii) a Dynamic text-adaptive Convolution Block (DCBlock), which queries source image features to distinguish text-required parts from text-irrelevant parts. Extensive experiments demonstrate that our DE-Net achieves excellent performance and manipulates source images more correctly and accurately. Code is available at https://github.com/tobran/DE-Net.
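
To make the idea of dynamic composition concrete, here is a minimal sketch of blending several editing modules with predicted weights. It is not the DE-Net implementation: the softmax normalisation, the two toy modules (`identity`, `scale_shift`), and the function `compose_edits` are all assumptions chosen for illustration, and the `scores` argument merely stands in for whatever a weight predictor such as Comp-Pred would output.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
EditModule = Callable[[Vector], Vector]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into non-negative weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def compose_edits(
    features: Vector,
    modules: Sequence[EditModule],
    scores: Sequence[float],
) -> Vector:
    """Weighted sum of each module's output (hypothetical composition rule)."""
    weights = softmax(scores)
    out = [0.0] * len(features)
    for w, module in zip(weights, modules):
        edited = module(features)
        for i, v in enumerate(edited):
            out[i] += w * v
    return out


# Two toy "editing modules" (assumptions, not the paper's modules).
def identity(x: Vector) -> Vector:
    return list(x)  # leaves the features unchanged


def scale_shift(x: Vector) -> Vector:
    return [2.0 * v + 1.0 for v in x]  # a simple affine edit


features = [1.0, -2.0, 0.5]

# Equal scores: both modules contribute equally.
blended = compose_edits(features, [identity, scale_shift], scores=[0.0, 0.0])

# A strongly skewed score: the output is dominated by the first module.
mostly_identity = compose_edits(
    features, [identity, scale_shift], scores=[100.0, -100.0]
)
```

In this sketch, changing `scores` per input is what makes the composition "dynamic": the same set of modules yields a light or a heavy edit depending on the weights, which is one way to picture how a fixed module could over-edit or under-edit.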

Code Implementations: 1 repo
