CV · Aug 21, 2024

AnyDesign: Versatile Area Fashion Editing via Mask-Free Diffusion

arXiv:2408.11553v4 · 2 citations · h-index: 7
Originality: Incremental advance
AI Analysis

This work addresses the need for more adaptable fashion editing tools in real-world scenarios, though the advance is incremental: it builds on existing diffusion models and extends an existing dataset.

The paper tackles flexible fashion image editing without auxiliary tools such as segmenters. It proposes AnyDesign, a diffusion-based method that enables mask-free editing on versatile areas and, according to the authors' experiments, outperforms contemporary text-guided fashion editing methods in quality.

Fashion image editing aims to modify a person's appearance based on a given instruction. Existing methods require auxiliary tools like segmenters and keypoint extractors, lacking a flexible and unified framework. Moreover, these methods are limited in the variety of clothing types they can handle, as most datasets focus on people in clean backgrounds and only include generic garments such as tops, pants, and dresses. These limitations restrict their applicability in real-world scenarios. In this paper, we first extend an existing dataset for human generation to include a wider range of apparel and more complex backgrounds. This extended dataset features people wearing diverse items such as tops, pants, dresses, skirts, headwear, scarves, shoes, socks, and bags. Additionally, we propose AnyDesign, a diffusion-based method that enables mask-free editing on versatile areas. Users can simply input a human image along with a corresponding prompt in either text or image format. Our approach incorporates Fashion DiT, equipped with a Fashion-Guidance Attention (FGA) module designed to fuse explicit apparel types and CLIP-encoded apparel features. Both qualitative and quantitative experiments demonstrate that our method delivers high-quality fashion editing and outperforms contemporary text-guided fashion editing methods.

