CV · Oct 31, 2017

Clothing Retrieval with Visual Attention Model

arXiv:1710.11446v1 · 55 citations
Originality: Incremental advance
AI Analysis

This work addresses clothing retrieval for computer vision applications, presenting an incremental improvement over existing methods.

The paper tackles clothing retrieval by proposing a self-learning Visual Attention Model (VAM) joined to a global network through an Impdrop connection, which makes the network more robust than the trivial Product connection when training data are limited.

Clothing retrieval is a challenging problem in computer vision. With advances in Convolutional Neural Networks (CNNs), the accuracy of clothing retrieval has improved significantly. FashionNet[1], a recent study, employs a set of artificial features in the form of landmarks, which are shown to be helpful for retrieval. However, its landmark detection module is trained with strong supervision, which requires considerable effort to obtain. In this paper, we propose a self-learning Visual Attention Model (VAM) to extract attention maps from clothing images. The VAM is connected to a global network to form an end-to-end network structure through an Impdrop connection, which randomly applies Dropout to the feature maps with probabilities given by the attention map. Extensive experiments on several widely used benchmark clothing retrieval data sets demonstrate the promise of the proposed method. We also show that, compared to the trivial Product connection, the Impdrop connection makes the network structure more robust when training sets of limited size are used.
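
The difference between the two connections named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the attention map holds one value in [0, 1] per spatial location, that this value is used as the probability of keeping that location, that the Product connection is an elementwise multiplication, and that inference falls back to the expected value of the mask. The function names `product_connection` and `impdrop_connection` are hypothetical.

```python
import numpy as np


def product_connection(features, attention):
    """Scale every channel of `features` (C, H, W) by `attention` (H, W)."""
    return features * attention[None, :, :]


def impdrop_connection(features, attention, rng, training=True):
    """Drop spatial locations at random, guided by the attention map.

    Assumption: attention[i, j] is the probability of KEEPING location
    (i, j); the abstract only says the dropout probabilities are "given by
    the attention map", so the exact mapping may differ in the paper.
    """
    if not training:
        # Assumption: at inference, use the expected value of the random
        # mask, which coincides with the Product connection.
        return product_connection(features, attention)
    # Bernoulli mask: a location is kept when a uniform sample in [0, 1)
    # falls below its attention value.
    keep = rng.random(attention.shape) < attention
    return features * keep[None, :, :]


rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8, 8))   # toy feature maps: 4 channels, 8x8
attention = rng.random((8, 8))          # toy attention map in [0, 1)
out = impdrop_connection(features, attention, rng)
```

In this sketch, highly attended locations survive most training steps while weakly attended ones are usually zeroed, so the global network sees a different stochastic mask on each pass rather than a fixed soft weighting.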

