CV · Aug 31, 2024

EraseDraw: Learning to Draw Step-by-Step via Erasing Objects from Images

arXiv:2409.00522v2 · 4 citations · h-index: 14
AI Analysis

This work addresses the challenge of generating spatially and optically consistent object insertions for creative image editing applications, though it is incremental in that it builds on existing object removal models.

The paper tackles realistic object insertion in images by inverting object removal to generate training data. Its text-conditioned diffusion model achieves state-of-the-art results in object insertion, particularly for in-the-wild images.

Creative processes such as painting often involve creating different components of an image one by one. Can we build a computational model to perform this task? Prior works often fail by making global changes to the image, inserting objects in unrealistic spatial locations, and generating inaccurate lighting details. We observe that while state-of-the-art models perform poorly on object insertion, they can remove objects and erase the background in natural images very well. Inverting the direction of object removal, we obtain high-quality data for learning to insert objects that are spatially, physically, and optically consistent with the surroundings. With this scalable automatic data generation pipeline, we can create a dataset for learning object insertion, which is used to train our proposed text-conditioned diffusion model. Qualitative and quantitative experiments have shown that our model achieves state-of-the-art results in object insertion, particularly for in-the-wild images. We show compelling results on diverse insertion prompts and images across various domains. In addition, we automate iterative insertion by combining our insertion model with beam search guided by CLIP.
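
The iterative insertion mentioned at the end of the abstract can be pictured as a beam search over partially completed images. The sketch below is only an illustration under stated assumptions, not the authors' implementation: `propose` stands in for sampling several outputs from an insertion model, `score` stands in for a CLIP-style image-text similarity, and summing per-step scores is an assumed ranking rule. All names here are hypothetical, and the toy functions exist only so the example runs without any model.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Image = TypeVar("Image")


def beam_search_insertion(
    start: Image,
    prompts: Sequence[str],
    propose: Callable[[Image, str], List[Image]],
    score: Callable[[Image, str], float],
    beam_width: int = 2,
) -> Tuple[Image, float]:
    """Insert one object per prompt, keeping the `beam_width` best partial results.

    `propose` and `score` are hypothetical stand-ins for an insertion model
    and a CLIP-style scorer; the cumulative-sum ranking is an assumption.
    """
    beams: List[Tuple[Image, float]] = [(start, 0.0)]
    for prompt in prompts:
        candidates: List[Tuple[Image, float]] = []
        for image, total in beams:
            # Expand every surviving image with each proposed insertion.
            for edited in propose(image, prompt):
                candidates.append((edited, total + score(edited, prompt)))
        if not candidates:
            raise ValueError(f"no candidates produced for prompt {prompt!r}")
        # Keep only the highest-scoring partial results for the next step.
        candidates.sort(key=lambda pair: pair[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


# Toy stand-ins: an "image" is a tuple of placed objects.
def toy_propose(image: tuple, prompt: str) -> List[tuple]:
    return [image + (f"{prompt}@left",), image + (f"{prompt}@right",)]


def toy_score(image: tuple, prompt: str) -> float:
    # Pretend the scorer prefers objects placed on the left.
    return 1.0 if image[-1] == f"{prompt}@left" else 0.5


best_image, best_score = beam_search_insertion(
    (), ["cat", "lamp"], toy_propose, toy_score, beam_width=2
)
```

Passing the model and the scorer as callables keeps the search logic independent of any particular diffusion or CLIP library; a real pipeline would swap in its own sampler and similarity function.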

