CV · MM · Feb 28, 2025

DiffBrush: Just Painting the Art by Your Hands

arXiv:2502.20904v1 · h-index: 7
Originality: Incremental advance
AI Analysis

DiffBrush addresses the challenge faced by ordinary users who want more precise control in AI painting without costly training.

The paper tackles the problem of text-to-image diffusion models struggling to accurately capture user requirements by introducing DiffBrush, which enables users to draw and edit images through hand-drawn sketches without additional training, achieving control over color, semantics, and object instances.

The rapid development of image generation and editing algorithms in recent years has enabled ordinary users to produce realistic images. However, the current AI painting ecosystem predominantly relies on text-driven diffusion models (T2I), which struggle to capture user requirements accurately. Furthermore, achieving compatibility with other modalities incurs substantial training costs. To this end, we introduce DiffBrush, which is compatible with T2I models and allows users to draw and edit images. By manipulating and adapting the internal representation of the diffusion model, DiffBrush guides the generated images to converge towards the user's hand-drawn sketches, meeting the user's specific needs without additional training. DiffBrush achieves control over the color, semantics, and instances of objects in images by continuously guiding the latent and the instance-level attention maps during the denoising process of the diffusion model. In addition, we propose latent regeneration, which refines the randomly sampled noise in the diffusion model to obtain a better image layout. Finally, users only need to roughly draw the mask of an instance (with acceptable colors) on the canvas, and DiffBrush naturally generates the corresponding instance at the corresponding location.
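The abstract describes guiding the latent during denoising without any training. A minimal toy sketch of that general idea (training-free guidance by gradient steps on an energy) is given below. It is not DiffBrush's actual loss or implementation: the masked squared-error energy, the flat-list "latent", and the names `masked_energy`, `guided_denoise`, and `toy_denoise_step` are all assumptions made for illustration, and the attention-map guidance and latent regeneration are not modelled.

```python
def masked_energy(latent, target, mask):
    """Mean squared distance between latent and target inside the user mask."""
    idx = [i for i, m in enumerate(mask) if m]
    return sum((latent[i] - target[i]) ** 2 for i in idx) / len(idx)


def masked_energy_grad(latent, target, mask):
    """Analytic gradient of masked_energy with respect to the latent."""
    n = sum(1 for m in mask if m)
    return [2.0 * (z - t) / n if m else 0.0
            for z, t, m in zip(latent, target, mask)]


def toy_denoise_step(latent, t):
    # Hypothetical stand-in for a real diffusion denoiser: it only shrinks
    # every value slightly and ignores the timestep t.
    return [0.95 * z for z in latent]


def guided_denoise(latent, target, mask, denoise_step, steps=20, step_size=0.5):
    """Alternate a denoising update with a gradient nudge toward the sketch."""
    for t in range(steps, 0, -1):
        latent = denoise_step(latent, t)
        grad = masked_energy_grad(latent, target, mask)
        # Only masked positions have a non-zero gradient, so the guidance
        # leaves everything outside the drawn region to the denoiser alone.
        latent = [z - step_size * g for z, g in zip(latent, grad)]
    return latent
```

In this toy setting, the masked entries are pulled toward the sketched target values at every step, while unmasked entries follow the stand-in denoiser unchanged. A real system would compute the energy on a diffusion model's latents and obtain gradients by automatic differentiation.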

