CV AI May 7, 2023

Data Efficient Training with Imbalanced Label Sample Distribution for Fashion Detection

Xin Shen, Praful Agrawal, Zhongwei Cheng

arXiv:2305.04379v5 5.96 citations

Originality: Incremental advance

AI Analysis

This work addresses data efficiency for E-commerce applications like fashion detection, but it is incremental as it builds on existing weighting techniques.

The paper tackles the problem of training multi-label classification models with imbalanced data distributions, such as in fashion attribute detection, by proposing a new weighted objective function that improves performance over non-weighted and inverse-frequency-based methods.

Multi-label classification models have a wide range of applications in E-commerce, including visual-based label prediction and language-based sentiment classification. A major challenge in achieving satisfactory performance for these tasks in the real world is the notable imbalance in data distribution. For instance, in fashion attribute detection, there may be only six 'puff sleeve' garments among 1000 products in most E-commerce fashion catalogs. To address this issue, we explore more data-efficient model training techniques rather than acquiring a huge number of annotations to collect sufficient samples, which is neither economical nor scalable. In this paper, we propose a state-of-the-art weighted objective function to boost the performance of deep neural networks (DNNs) for multi-label classification with long-tailed data distribution. Our experiments involve image-based attribute classification of fashion apparel, and the results demonstrate favorable performance for the new weighting method compared to non-weighted and inverse-frequency-based weighting mechanisms. We further evaluate the robustness of the new weighting mechanism using two popular fashion attribute types in today's fashion industry: sleevetype and archetype.
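
The abstract does not spell out the proposed weighting, so the following is only a minimal sketch of the inverse-frequency baseline it compares against, not the authors' method. The function names `inverse_frequency_weights` and `weighted_bce` are hypothetical, and the toy catalog reuses the abstract's figure of six 'puff sleeve' items in 1000 products; the second, common label is an assumption added for contrast.

```python
import math


def inverse_frequency_weights(label_matrix):
    """Return one positive-class weight per label: num_samples / num_positives."""
    num_samples = len(label_matrix)
    num_labels = len(label_matrix[0])
    weights = []
    for j in range(num_labels):
        positives = sum(row[j] for row in label_matrix)
        # Guard against labels that have no positive examples at all.
        weights.append(num_samples / max(positives, 1))
    return weights


def weighted_bce(probs, targets, pos_weights, eps=1e-12):
    """Mean binary cross-entropy over labels; each positive term is scaled by its weight."""
    total = 0.0
    for p, y, w in zip(probs, targets, pos_weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total += -(w * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


# Toy catalog of 1000 products (assumed data):
# label 0 is rare (6 positives), label 1 is common (500 positives).
labels = [[1 if i < 6 else 0, 1 if i < 500 else 0] for i in range(1000)]
weights = inverse_frequency_weights(labels)

# Loss for one product that carries the rare label but not the common one,
# with the model predicting 0.5 for both labels.
loss = weighted_bce([0.5, 0.5], [1, 0], weights)
```

With this scheme a missed rare positive costs far more than a missed common one, which is the behavior that inverse-frequency weighting is meant to provide on long-tailed labels.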
