CV · May 30, 2019

Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback

arXiv:1905.12794v3 · 84 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses the need for more natural conversational interfaces in retail fashion, but its contribution is incremental: it pairs existing retrieval methods with a new dataset.

The paper introduces the Fashion IQ dataset, the first fashion dataset with human-generated captions distinguishing similar garment images, and presents a transformer-based interactive image retriever that improves state-of-the-art performance in dialog-based image retrieval.

Conversational interfaces for the detail-oriented retail fashion domain are more natural, expressive, and user friendly than classical keyword-based search interfaces. In this paper, we introduce the Fashion IQ dataset to support and advance research on interactive fashion image retrieval. Fashion IQ is the first fashion dataset to provide human-generated captions that distinguish similar pairs of garment images together with side-information consisting of real-world product descriptions and derived visual attribute labels for these images. We provide a detailed analysis of the characteristics of the Fashion IQ data, and present a transformer-based user simulator and interactive image retriever that can seamlessly integrate visual attributes with image features, user feedback, and dialog history, leading to improved performance over the state of the art in dialog-based image retrieval. We believe that our dataset will encourage further work on developing more natural and real-world applicable conversational shopping assistants.

Code Implementations: 3 repos