CV · AI · CL · Jun 16, 2023

MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing

Microsoft
arXiv:2306.10012v3 · 640 citations · h-index: 42
Originality: Incremental advance
AI Analysis

This work addresses the need for high-quality training data in image editing for both personal and professional users, though the advance is incremental because it builds on existing datasets and methods.

The authors tackled the problem of noisy training data in text-guided image editing by introducing MagicBrush, a manually annotated dataset of over 10K triplets, which improved model performance in human evaluations.

Text-guided image editing is widely needed in daily life, ranging from personal use to professional applications such as Photoshop. However, existing methods are either zero-shot or trained on an automatically synthesized dataset, which contains a high volume of noise. Thus, they still require lots of manual tuning to produce desirable outcomes in practice. To address this issue, we introduce MagicBrush (https://osu-nlp-group.github.io/MagicBrush/), the first large-scale, manually annotated dataset for instruction-guided real image editing that covers diverse scenarios: single-turn, multi-turn, mask-provided, and mask-free editing. MagicBrush comprises over 10K manually annotated triplets (source image, instruction, target image), which supports training large-scale text-guided image editing models. We fine-tune InstructPix2Pix on MagicBrush and show that the new model can produce much better images according to human evaluation. We further conduct extensive experiments to evaluate current image editing baselines from multiple dimensions including quantitative, qualitative, and human evaluations. The results reveal the challenging nature of our dataset and the gap between current baselines and real-world editing needs.

Code Implementations: 1 repo
