CV · AI · Dec 13, 2024

BrushEdit: All-In-One Image Inpainting and Editing

arXiv:2412.10316v3 · 30 citations · h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses autonomous, user-friendly, and interactive free-form image editing, and represents an incremental improvement over existing inversion-based and instruction-based methods.

The paper tackles the limitations of current image editing methods by proposing BrushEdit, an inpainting-based instruction-guided paradigm that integrates multimodal large language models and a dual-branch inpainting model, achieving superior performance across seven metrics such as mask region preservation and editing effect coherence.

Image editing has advanced significantly with the development of diffusion models using both inversion-based and instruction-based methods. However, current inversion-based approaches struggle with large modifications (e.g., adding or removing objects) because the structured nature of inversion noise hinders substantial changes. Meanwhile, instruction-based methods often constrain users to black-box operations, limiting direct interaction for specifying editing regions and intensity. To address these limitations, we propose BrushEdit, a novel inpainting-based instruction-guided image editing paradigm, which leverages multimodal large language models (MLLMs) and image inpainting models to enable autonomous, user-friendly, and interactive free-form instruction editing. Specifically, we devise a system enabling free-form instruction editing by integrating MLLMs and a dual-branch image inpainting model in an agent-cooperative framework to perform editing category classification, main object identification, mask acquisition, and editing area inpainting. Extensive experiments show that our framework effectively combines MLLMs and inpainting models, achieving superior performance across seven metrics including mask region preservation and editing effect coherence.
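
The four stages named in the abstract (editing category classification, main object identification, mask acquisition, and editing area inpainting) can be pictured as a simple pipeline. The sketch below is only an illustration under stated assumptions, not the authors' implementation: `EditAgents`, `run_edit`, `blend`, and the toy agents are hypothetical names, the image is a tiny grid of integers, and the final blending step that copies unmasked pixels back from the original is an assumed way of preserving the region outside the mask, not something the abstract confirms.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]  # toy grayscale image: rows of pixel values
Mask = List[List[int]]   # 1 = pixel to edit, 0 = pixel to keep


@dataclass
class EditAgents:
    """Hypothetical stand-ins for the MLLM, mask, and inpainting components."""
    classify_edit: Callable[[str], str]           # instruction -> edit category
    identify_object: Callable[[str], str]         # instruction -> main object
    acquire_mask: Callable[[Image, str], Mask]    # image + object -> binary mask
    inpaint: Callable[[Image, Mask, str], Image]  # image + mask + text -> new image


def blend(original: Image, generated: Image, mask: Mask) -> Image:
    """Keep original pixels where mask is 0 (assumed preservation step)."""
    return [
        [g if m else o for o, g, m in zip(orow, grow, mrow)]
        for orow, grow, mrow in zip(original, generated, mask)
    ]


def run_edit(image: Image, instruction: str, agents: EditAgents) -> Tuple[str, str, Image]:
    """Run the four stages in order and return (category, target, edited image)."""
    category = agents.classify_edit(instruction)          # 1. category classification
    target = agents.identify_object(instruction)          # 2. main object identification
    mask = agents.acquire_mask(image, target)             # 3. mask acquisition
    generated = agents.inpaint(image, mask, instruction)  # 4. editing area inpainting
    return category, target, blend(image, generated, mask)


# Toy agents, for illustration only: real ones would call an MLLM,
# a detection/segmentation model, and a diffusion inpainting model.
toy_agents = EditAgents(
    classify_edit=lambda text: "remove" if "remove" in text else "add",
    identify_object=lambda text: text.split()[-1],
    acquire_mask=lambda img, obj: [[0, 1], [0, 0]],
    inpaint=lambda img, mask, text: [[99 for _ in row] for row in img],
)

category, target, edited = run_edit([[10, 10], [10, 10]], "remove the cup", toy_agents)
```

In this toy run the inpainting stub overwrites every pixel, and the blending step restores all pixels outside the mask, so only the single masked pixel differs from the input. Passing the stages in as callables also leaves room for a user to supply or adjust the mask by hand, in the spirit of the interactive editing the abstract describes.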
