CV | Jul 18, 2023

AnyDoor: Zero-shot Object-level Image Customization

arXiv:2307.09481v2 | 475 citations | h-index: 41
AI Analysis

This work addresses the challenge of seamlessly integrating objects into diverse scenes, a known bottleneck in applications such as virtual try-on and object moving, and proposes a new method for it.

The authors tackled the problem of zero-shot object-level image customization by developing AnyDoor, a diffusion-based model that teleports objects to new scenes harmoniously, achieving superior performance over existing alternatives and demonstrating potential in applications like virtual try-on.

This work presents AnyDoor, a diffusion-based image generator with the power to teleport target objects to new scenes at user-specified locations in a harmonious way. Instead of tuning parameters for each object, our model is trained only once and effortlessly generalizes to diverse object-scene combinations at the inference stage. Such a challenging zero-shot setting requires an adequate characterization of a certain object. To this end, we complement the commonly used identity feature with detail features, which are carefully designed to maintain texture details yet allow versatile local variations (e.g., lighting, orientation, posture, etc.), supporting the object in favorably blending with different surroundings. We further propose to borrow knowledge from video datasets, where we can observe various forms (i.e., along the time axis) of a single object, leading to stronger model generalizability and robustness. Extensive experiments demonstrate the superiority of our approach over existing alternatives as well as its great potential in real-world applications, such as virtual try-on and object moving. Project page is https://damo-vilab.github.io/AnyDoor-Page/.
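The abstract does not say how the detail features are computed or how the user-specified location is fed to the model. As a rough illustration only, the sketch below assumes one plausible reading: a Sobel-style high-frequency map of the object (keeping edges and texture while discarding absolute color and lighting) is pasted into the scene at a chosen position to form a conditioning collage. The function names `sobel_high_freq` and `make_collage`, the grayscale inputs, and the choice of Sobel filters are all assumptions made for this example, not details taken from the text above.

```python
import numpy as np


def sobel_high_freq(gray: np.ndarray) -> np.ndarray:
    """Gradient-magnitude map of a 2-D grayscale image (hypothetical helper)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
    ky = kx.T
    h, w = gray.shape
    # Edge padding keeps flat regions at exactly zero gradient near borders.
    padded = np.pad(gray.astype(np.float32), 1, mode="edge")
    gx = np.zeros((h, w), dtype=np.float32)
    gy = np.zeros((h, w), dtype=np.float32)
    # Accumulate the 3x3 filter response one kernel tap at a time.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)


def make_collage(scene: np.ndarray, obj: np.ndarray, mask: np.ndarray,
                 top: int, left: int) -> np.ndarray:
    """Paste the masked high-frequency map of `obj` into a copy of `scene`.

    `top` and `left` stand in for the user-specified location (assumed to be
    the top-left corner of the target box).
    """
    h, w = obj.shape
    if top < 0 or left < 0 or top + h > scene.shape[0] or left + w > scene.shape[1]:
        raise ValueError("object box does not fit inside the scene")
    detail = sobel_high_freq(obj) * mask
    out = scene.astype(np.float32).copy()
    region = out[top:top + h, left:left + w]
    # Inside the mask use the detail map; elsewhere keep the scene pixels.
    out[top:top + h, left:left + w] = np.where(mask > 0, detail, region)
    return out


# Toy usage: a 4x4 object with a vertical step edge, placed at (2, 2).
scene = np.ones((8, 8), dtype=np.float32)
obj = np.zeros((4, 4), dtype=np.float32)
obj[:, 2:] = 1.0
mask = np.ones((4, 4), dtype=np.float32)
collage = make_collage(scene, obj, mask, top=2, left=2)
```

In this toy case the collage carries only the step edge of the object, so a downstream generator would be free to choose the shading around it; whether AnyDoor's detail features behave this way should be checked against the paper and the project page.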

Code Implementations: 2 repos
