NER-Luxury: Named entity recognition for the fashion and luxury domain
This work addresses entity recognition challenges in the fashion and luxury domain; the contribution is incremental, as it applies existing methods to a new, specialized dataset.
The study tackled named entity recognition for the fashion and luxury industry by creating a dataset of over 40K sentences with a taxonomy of 36+ entity types and developing five fine-tuned models, achieving promising results compared to state-of-the-art open-source large language models.
In this study, we address multiple challenges of developing a named-entity recognition model in English for the fashion and luxury industry, namely entity disambiguation, French technical jargon across multiple sub-sectors, the scarcity of ESG methodology, and the disparate company structures of the sector, which range from small and medium-sized luxury houses to large conglomerates leveraging economies of scale. In this work, we introduce a taxonomy of 36+ entity types with a luxury-oriented annotation scheme, and create a dataset of more than 40K sentences respecting a clear hierarchical classification. We also present five supervised fine-tuned NER-Luxury models for fashion, beauty, watches, jewelry, fragrances, cosmetics, and overall luxury, focusing equally on the aesthetic side and the quantitative side. In an additional experiment, we quantitatively compare the NER performance of our models against state-of-the-art open-source large language models; the results are promising and highlight the benefits of incorporating a bespoke NER model into existing machine learning pipelines.
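To make the quantitative comparison concrete, the following is a minimal sketch of entity-level precision, recall, and F1 computed over BIO-tagged sentences. It assumes exact-match scoring on (span, type) pairs, which the abstract does not specify; the function names and the `BRAND` and `MATERIAL` labels are hypothetical illustrations and are not taken from the NER-Luxury taxonomy.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert one sentence of BIO tags into a set of typed entity spans."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes an entity that ends the sentence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))
        if tag.startswith("B-") or tag.startswith("I-"):
            # A stray I- tag is treated leniently as the start of an entity.
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def entity_prf(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged entity-level precision, recall, and F1 (exact match)."""
    tp = fp = fn = 0
    for gold_tags, pred_tags in zip(gold, pred):
        gold_spans = bio_to_spans(gold_tags)
        pred_spans = bio_to_spans(pred_tags)
        tp += len(gold_spans & pred_spans)
        fp += len(pred_spans - gold_spans)
        fn += len(gold_spans - pred_spans)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical labels: a four-token sentence with one brand and one material.
gold = [["B-BRAND", "I-BRAND", "O", "B-MATERIAL"]]
pred = [["B-BRAND", "I-BRAND", "O", "O"]]
precision, recall, f1 = entity_prf(gold, pred)
```

Under this scheme, a prediction counts as correct only if both the span boundaries and the entity type match the gold annotation, so the same function can score a fine-tuned model and a prompted large language model once their outputs are mapped to BIO tags.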