ObjectClear: Complete Object Removal via Object-Effect Attention
This work addresses accurate removal of objects together with their visual effects in image editing. The contribution appears incremental, since it builds on existing diffusion-based methods, but it adds a novel attention mechanism.
The paper tackles object removal together with associated effects such as shadows and reflections. It introduces a new dataset, OBER, and a framework, ObjectClear, that outperforms existing methods in removal quality and background fidelity.
Object removal requires eliminating not only the target object but also its effects, such as shadows and reflections. However, diffusion-based inpainting methods often produce artifacts, hallucinate content, alter the background, and struggle to remove object effects accurately. To address this challenge, we introduce a new dataset for OBject-Effect Removal, named OBER, which provides paired images with and without object effects, along with precise masks for both objects and their associated visual artifacts. The dataset comprises high-quality captured and simulated data, covering diverse object categories and complex multi-object scenes. Building on OBER, we propose a novel framework, ObjectClear, which incorporates an object-effect attention mechanism to guide the model toward the foreground removal regions by learning attention masks, effectively decoupling foreground removal from background reconstruction. Furthermore, the predicted attention map enables an attention-guided fusion strategy during inference, greatly preserving background details. Extensive experiments demonstrate that ObjectClear outperforms existing methods, achieving improved object-effect removal quality and background fidelity, especially in complex scenarios.
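The attention-guided fusion step can be illustrated with a minimal sketch. It assumes the fusion is a per-pixel blend in which the predicted attention map selects the generated pixels inside the object-effect region and the original pixels elsewhere. The abstract does not specify the exact operation, so the function name `attention_guided_fusion`, the optional `threshold` parameter, and the linear blending rule are all hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def attention_guided_fusion(original, generated, attention, threshold=None):
    """Blend a generated image into the original using an attention map.

    Hypothetical sketch: `attention` is assumed to be an (H, W) map in [0, 1]
    that is high over the object and its effects (shadows, reflections).
    """
    original = np.asarray(original, dtype=np.float32)
    generated = np.asarray(generated, dtype=np.float32)
    weight = np.clip(np.asarray(attention, dtype=np.float32), 0.0, 1.0)

    # Optionally binarize the map (an assumption; a soft blend is the default).
    if threshold is not None:
        weight = (weight >= threshold).astype(np.float32)

    # Add a channel axis so an (H, W) map broadcasts over (H, W, C) images.
    if weight.ndim == original.ndim - 1:
        weight = weight[..., None]

    # Generated content inside the attended region, original pixels elsewhere.
    return weight * generated + (1.0 - weight) * original


if __name__ == "__main__":
    original = np.zeros((4, 4, 3), dtype=np.float32)   # stand-in input image
    generated = np.ones((4, 4, 3), dtype=np.float32)   # stand-in model output
    attention = np.zeros((4, 4), dtype=np.float32)
    attention[1:3, 1:3] = 1.0                          # attended removal region
    fused = attention_guided_fusion(original, generated, attention)
    print(fused[..., 0])
```

Under this assumption, pixels where the attention is zero are copied unchanged from the input, which is one way background details could be preserved exactly rather than re-synthesized by the diffusion model.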