CV · AI · IR · LG · Dec 27, 2022

Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval

arXiv:2301.13014v2 · 8 citations · h-index: 39 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses attribute-based retrieval of similar fashion items, a task that matters for e-commerce applications. The advance is incremental, since it builds on existing attention-based methods.

The paper tackles the feature gap problem in fine-grained fashion retrieval by introducing an attribute-guided multi-level attention network (AG-MAN), which improves performance over existing methods, achieving 62.8788% MAP on FashionAI, 8.9804% MAP on DeepFashion, and 93.32% prediction accuracy on Zappos50k.
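For readers unfamiliar with the MAP figures quoted above, the following is a minimal sketch of how mean average precision is commonly computed for retrieval. It is not taken from the paper or its repository: the function names and the 0/1 relevance-list input format are assumptions made for illustration, and the paper's exact evaluation protocol may differ.

```python
def average_precision(relevance):
    """Average precision for one query.

    `relevance` is a ranked list of 0/1 flags (hypothetical input format):
    1 means the gallery item at that rank shares the queried attribute value.
    Assumes the whole gallery is ranked, so every relevant item appears.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_relevance_lists):
    """Mean of the per-query average precisions."""
    aps = [average_precision(r) for r in ranked_relevance_lists]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Two toy queries with hand-made relevance flags.
    print(mean_average_precision([[1, 0, 1], [0, 1]]))
```

A reported value such as 62.8788% corresponds to this quantity multiplied by 100, averaged over all evaluation queries.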

Fine-grained fashion retrieval searches for items that share a similar attribute with the query image. Most existing methods use a pre-trained feature extractor (e.g., ResNet-50) to capture image representations. However, a pre-trained feature backbone is typically trained for image classification and object detection, which are fundamentally different tasks from fine-grained fashion retrieval. Existing methods therefore suffer from a feature gap problem when the pre-trained backbone is fine-tuned directly. To solve this problem, we introduce an attribute-guided multi-level attention network (AG-MAN). Specifically, we first enhance the pre-trained feature extractor to capture multi-level image embeddings, thereby enriching the low-level features within these representations. We then propose a classification scheme in which images with the same attribute, albeit with different values, are categorized into the same class. This further alleviates the feature gap problem by perturbing object-centric feature learning. Moreover, we propose an improved attribute-guided attention module for extracting more accurate attribute-specific representations. Our model consistently outperforms existing attention-based methods on the FashionAI (62.8788% MAP), DeepFashion (8.9804% MAP), and Zappos50k (93.32% prediction accuracy) datasets. In particular, it improves on the representative ASENet_V2 model by 2.12, 0.31, and 0.78 percentage points on the FashionAI, DeepFashion, and Zappos50k datasets, respectively. The source code is available at https://github.com/Dr-LingXiao/AG-MAN.
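The general idea of attribute-guided attention can be illustrated with a small sketch. The code below is not the AG-MAN module and is not drawn from the linked repository. It swaps in plain scaled dot-product attention over spatial locations, and the function name, the list-based inputs, and the toy vectors are all assumptions made for illustration.

```python
import math


def attribute_guided_attention(features, attribute):
    """Pool spatial features under the guidance of an attribute embedding.

    features:  list of N location vectors, each of length d (hypothetical
               flattened feature map from a backbone).
    attribute: length-d embedding of the queried attribute (hypothetical).
    Returns (weights, pooled): softmax weights over locations and the
    weighted sum of the location vectors.
    """
    d = len(attribute)
    # Scaled dot-product score between each location and the attribute.
    scores = [
        sum(f * a for f, a in zip(loc, attribute)) / math.sqrt(d)
        for loc in features
    ]
    # Numerically stable softmax over locations.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attribute-specific representation: attention-weighted sum.
    pooled = [
        sum(w * loc[k] for w, loc in zip(weights, features))
        for k in range(d)
    ]
    return weights, pooled


if __name__ == "__main__":
    toy_features = [[1.0, 0.0], [0.0, 1.0]]  # two locations, d = 2
    toy_attribute = [1.0, 0.0]               # aligned with location 0
    weights, pooled = attribute_guided_attention(toy_features, toy_attribute)
    print(weights, pooled)
```

In this toy case the location aligned with the attribute embedding receives the larger weight, so the pooled vector leans toward it. In the same spirit, a multi-level variant could apply such pooling to feature maps from several backbone stages and concatenate the results, though how AG-MAN combines its levels is not specified in the abstract.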

Code Implementations: 1 repo