CV · Jul 5, 2023

DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models

arXiv:2307.02421v2 · 226 citations · h-index: 36 · Has Code
AI Analysis

DragonDiffusion addresses the need for more controllable editing in text-to-image models, a known bottleneck for users in creative and design fields, and proposes a novel method for it.

The paper tackles the problem of precise image editing in diffusion models by proposing DragonDiffusion, a method that enables drag-style manipulation without fine-tuning, achieving object moving, resizing, appearance replacement, and content dragging using only image-based signals.

Despite the ability of existing large-scale text-to-image (T2I) models to generate high-quality images from detailed textual descriptions, they often lack the ability to precisely edit the generated or real images. In this paper, we propose a novel image editing method, DragonDiffusion, enabling Drag-style manipulation on Diffusion models. Specifically, we construct classifier guidance based on the strong correspondence of intermediate features in the diffusion model. It can transform the editing signals into gradients via feature correspondence loss to modify the intermediate representation of the diffusion model. Based on this guidance strategy, we also build a multi-scale guidance to consider both semantic and geometric alignment. Moreover, a cross-branch self-attention is added to maintain the consistency between the original image and the editing result. Our method, through an efficient design, achieves various editing modes for the generated or real images, such as object moving, object resizing, object appearance replacement, and content dragging. It is worth noting that all editing and content preservation signals come from the image itself, and the model does not require fine-tuning or additional modules. Our source code will be available at https://github.com/MC-E/DragonDiffusion.
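The guidance idea in the abstract (turning an editing signal into gradients through a feature correspondence loss) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the latent is a flat array of per-position feature vectors standing in for the diffusion model's intermediate features, the loss is a plain mean of (1 - cosine similarity), and the names `cosine_loss_and_grad`, `guidance_step`, `src_idx`, and `dst_idx` are hypothetical. A real implementation would extract features from the denoising network and backpropagate through it.

```python
import numpy as np

def cosine_loss_and_grad(edit_feats, ref_feats, eps=1e-8):
    """Mean (1 - cosine similarity) over paired rows, and its gradient
    with respect to edit_feats (eps is treated as a constant)."""
    a_norm = np.linalg.norm(edit_feats, axis=1, keepdims=True) + eps
    b_norm = np.linalg.norm(ref_feats, axis=1, keepdims=True) + eps
    cos = np.sum(edit_feats * ref_feats, axis=1, keepdims=True) / (a_norm * b_norm)
    loss = float(np.mean(1.0 - cos))
    # d cos / d a = b / (|a||b|) - cos * a / |a|^2
    dcos = ref_feats / (a_norm * b_norm) - cos * edit_feats / (a_norm ** 2)
    grad = -dcos / edit_feats.shape[0]
    return loss, grad

def guidance_step(latent, ref_latent, src_idx, dst_idx, step_size=1.0):
    """One gradient step that pulls the features at the target positions
    (dst_idx) toward the reference features at the source positions (src_idx).
    Positions outside dst_idx receive zero gradient and stay unchanged."""
    loss, grad_rows = cosine_loss_and_grad(latent[dst_idx], ref_latent[src_idx])
    grad = np.zeros_like(latent)
    grad[dst_idx] = grad_rows
    return latent - step_size * grad, loss

# Toy setup (assumed): 16 spatial positions, 8 feature channels each.
rng = np.random.default_rng(0)
ref_latent = rng.normal(size=(16, 8))   # features of the original image
latent = rng.normal(size=(16, 8))       # features of the edited result
start = latent.copy()
src_idx = np.array([0, 1, 2, 3])        # where the object is in the original
dst_idx = np.array([8, 9, 10, 11])      # where it should appear after "moving"

losses = []
for _ in range(100):
    latent, loss = guidance_step(latent, ref_latent, src_idx, dst_idx)
    losses.append(loss)

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

In this sketch only the target positions are updated, which loosely mirrors the idea that the editing signal comes from the image's own features rather than from extra modules or fine-tuning. Content preservation (the cross-branch self-attention mentioned in the abstract) is not modelled here.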

Code Implementations (2 repos)
