CV | IR | Apr 10

FashionStylist: An Expert Knowledge-enhanced Multimodal Dataset for Fashion Understanding

arXiv:2604.09249 | 37.9
AI Analysis

FashionStylist addresses the need for a comprehensive fashion understanding dataset for researchers and developers in AI and fashion technology. The contribution is incremental, however, because it builds on existing data collection efforts.

The paper tackles the problem of fragmented, task-specific fashion datasets by introducing FashionStylist, an expert-annotated benchmark for holistic fashion understanding. It supports three tasks (outfit-to-item grounding, outfit completion, and outfit evaluation) and serves both as a unified benchmark and as a training resource for MLLM-based fashion systems.

Fashion understanding requires both visual perception and expert-level reasoning about style, occasion, compatibility, and outfit rationale. However, existing fashion datasets remain fragmented and task-specific, often focusing on item attributes, outfit co-occurrence, or weak textual supervision, and thus provide limited support for holistic outfit understanding. In this paper, we introduce FashionStylist, an expert-annotated benchmark for holistic and expert-level fashion understanding. Constructed through a dedicated fashion-expert annotation pipeline, FashionStylist provides professionally grounded annotations at both the item and outfit levels. It supports three representative tasks: outfit-to-item grounding, outfit completion, and outfit evaluation. These tasks cover realistic item recovery from complex outfits with layering and accessories, compatibility-aware composition beyond co-occurrence matching, and expert-level assessment of style, season, occasion, and overall coherence. Experimental results show that FashionStylist serves not only as a unified benchmark for multiple fashion tasks, but also as an effective training resource for improving grounding, completion, and outfit-level semantic evaluation in MLLM-based fashion systems.
