CV · Mar 29, 2023

MDP: A Generalized Framework for Text-Guided Image Editing by Manipulating the Diffusion Path

arXiv:2303.16765v2 · 25 citations · h-index: 70
AI Analysis

This work provides a generalized framework for improving image editing in diffusion models, though it appears incremental as it builds on existing methods.

The authors tackled the problem of text-guided image editing by analyzing diffusion network equations to propose the MDP framework, which identifies five manipulation types and demonstrates that a specific configuration manipulating predicted noise achieves higher-quality edits than prior methods.

Image generation using diffusion can be controlled in multiple ways. In this paper, we systematically analyze the equations of modern generative diffusion networks to propose a framework, called MDP, that explains the design space of suitable manipulations. We identify 5 different manipulations, including intermediate latent, conditional embedding, cross attention maps, guidance, and predicted noise. We analyze the corresponding parameters of these manipulations and the manipulation schedule. We show that some previous editing methods fit nicely into our framework. Particularly, we identified one specific configuration as a new type of control by manipulating the predicted noise, which can perform higher-quality edits than previous work for a variety of local and global edits.
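As a rough illustration of what "manipulating the predicted noise" on a schedule could look like, here is a minimal sketch. It is not the authors' implementation. It assumes a deterministic DDIM-style sampler, and it assumes the manipulation is a linear blend between the noise predicted under a source condition and under a target condition, applied only inside a window of sampling steps. The abstract does not confirm either detail. The names `toy_eps`, `blend_weight`, and `edit_by_predicted_noise` are hypothetical, and `toy_eps` is a stand-in for a trained noise-prediction network.

```python
import math


def toy_eps(x, cond, t):
    """Stand-in for a noise predictor eps_theta(x_t, c, t). Not a real model."""
    return [0.1 * xi + cond * (t / 10.0) for xi in x]


def blend_weight(step, start, end, w=1.0):
    """Hypothetical manipulation schedule: weight w inside [start, end), else 0."""
    return w if start <= step < end else 0.0


def edit_by_predicted_noise(x_T, cond_src, cond_tgt, alphas_bar, start, end):
    """Deterministic DDIM-style sampling with a blended noise prediction.

    Assumption: the manipulated noise is (1 - w) * eps_src + w * eps_tgt,
    where w comes from the step-window schedule above.
    """
    x = list(x_T)
    T = len(alphas_bar)
    # step counts sampling iterations (0 = noisiest); t is the timestep index.
    for step, t in enumerate(range(T - 1, -1, -1)):
        w = blend_weight(step, start, end)
        e_src = toy_eps(x, cond_src, t)
        e_tgt = toy_eps(x, cond_tgt, t)
        eps = [(1 - w) * a + w * b for a, b in zip(e_src, e_tgt)]

        a_t = alphas_bar[t]
        a_prev = alphas_bar[t - 1] if t > 0 else 1.0
        # Predict the clean sample, then re-noise it to the previous timestep.
        x0 = [(xi - math.sqrt(1 - a_t) * ei) / math.sqrt(a_t)
              for xi, ei in zip(x, eps)]
        x = [math.sqrt(a_prev) * x0i + math.sqrt(1 - a_prev) * ei
             for x0i, ei in zip(x0, eps)]
    return x


# Toy cumulative-alpha schedule: index 0 is the least noisy timestep.
alphas_bar = [0.99 - 0.089 * i for i in range(10)]
x_T = [1.0, -0.5, 0.25, 2.0]

source_only = edit_by_predicted_noise(x_T, 0.0, 1.0, alphas_bar, 0, 0)
target_only = edit_by_predicted_noise(x_T, 0.0, 1.0, alphas_bar, 0, 10)
windowed = edit_by_predicted_noise(x_T, 0.0, 1.0, alphas_bar, 3, 7)
```

With an empty window the sampler follows the source condition only, and with a window covering every step it follows the target condition only. Intermediate windows sit between the two, which is the knob a manipulation schedule would expose in this sketch.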

Code Implementations: 2 repos