CV | Apr 5, 2019

Snap and Find: Deep Discrete Cross-domain Garment Image Retrieval

arXiv:1904.02887v1
7 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for intelligent search systems in online retail by improving garment image retrieval, though the advance is incremental because it builds on existing attribute-based approaches.

The paper tackles the problem of matching customer-snapped garment photos to store-provided product images by proposing a deep multi-task cross-domain hashing method that simultaneously models cross-domain embedding and sequential attribute learning, achieving a 306× efficiency boost compared to state-of-the-art models.

With the increasing number of online stores, there is a pressing need for intelligent search systems that understand the item photos snapped by customers and search large-scale product databases to find the desired items. However, it is challenging for conventional retrieval systems to match the item photos captured by customers with the ones officially released by stores, especially for garment images. To bridge customer- and store-provided garment photos, existing studies have widely exploited clothing attributes (e.g., black) and landmarks (e.g., collar) to learn a common embedding space for garment representations. Unfortunately, they omit the sequential correlation of attributes and require a large amount of human labor to label the landmarks. In this paper, we propose a deep multi-task cross-domain hashing method, termed DMCH, in which cross-domain embedding and sequential attribute learning are modeled simultaneously. Sequential attribute learning not only provides semantic guidance for the embedding, but also generates rich attention on discriminative local details (e.g., black buttons) of clothing items without requiring extra landmark labels. This leads to promising performance and a 306× boost in efficiency compared with state-of-the-art models, as demonstrated through rigorous experiments on two public fashion datasets.
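As a rough illustration of how hashing-based retrieval works in general, the sketch below binarizes real-valued embeddings and ranks store items by Hamming distance to a customer query. It is not the DMCH model: the function names (`binarize`, `hamming_distances`, `retrieve`), the 64-bit code length, and the random vectors standing in for network outputs are all assumptions made for illustration, and the sketch makes no attempt to reproduce the reported efficiency figure.

```python
import numpy as np

def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Map real-valued embeddings of shape (n, n_bits) to packed binary codes.

    Assumption: codes come from thresholding at zero, a common choice in
    hashing methods; the paper's actual binarization may differ.
    """
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)  # shape (n, ceil(n_bits / 8))

def hamming_distances(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Hamming distance from one packed query code to every packed database code."""
    xor = np.bitwise_xor(db_codes, query_code)      # bytes whose set bits mark differences
    return np.unpackbits(xor, axis=1).sum(axis=1)   # count differing bits per row

def retrieve(query_code: np.ndarray, db_codes: np.ndarray, top_k: int = 5):
    """Return indices and distances of the top_k closest database codes."""
    dists = hamming_distances(query_code, db_codes)
    order = np.argsort(dists, kind="stable")[:top_k]
    return order, dists[order]

# Hypothetical data: random vectors stand in for the outputs of an embedding network.
rng = np.random.default_rng(0)
store_embeddings = rng.standard_normal((1000, 64))                      # store product photos
query_embedding = store_embeddings[42] + 0.1 * rng.standard_normal(64)  # noisy customer photo

store_codes = binarize(store_embeddings)
query_code = binarize(query_embedding[None, :])[0]

indices, distances = retrieve(query_code, store_codes, top_k=5)
print(indices, distances)
```

Comparing packed binary codes needs only XOR and bit counting, which is the usual reason hashing methods are cheaper in time and memory than searching over floating-point embeddings.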

