Interpretable Methods for Identifying Product Variants
This work addresses the need of e-commerce companies to group products effectively, which supports better customer experiences and brand management. The contribution is incremental: it builds on existing methods with specific adaptations.
The paper tackles the problem of identifying product variants in e-commerce by introducing a method that combines constrained clustering with tailored NLP techniques, and it outperforms an existing baseline that uses a vanilla classification approach.
For e-commerce companies with large product selections, organizing and grouping products in meaningful ways is important for creating a good customer shopping experience and cultivating an authoritative brand image. One important way of grouping products is to identify a family of product variants, where the variants are mostly the same but have slight yet distinct differences (e.g., color or pack size). In this paper, we introduce a novel approach to identifying product variants. It combines constrained clustering with tailored NLP techniques (e.g., extracting the product family name from the unstructured product title and identifying products with similar model numbers) to achieve superior performance compared with an existing baseline that uses a vanilla classification approach. In addition, we design the algorithm to meet certain business criteria: it must meet high accuracy requirements across a wide range of categories (e.g., appliances, decor, tools, and building materials), and it must prioritize interpretability so that the model is accessible and understandable to all business partners.
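To make the combination of model-number similarity and constrained clustering concrete, the following is a minimal sketch and not the paper's actual algorithm. Everything specific in it is an assumption made for illustration: the common-prefix similarity measure, the 0.6 threshold, the "same brand" cannot-link constraint, the greedy merge order, and the product records themselves.

```python
import re
from itertools import combinations


def normalize(model: str) -> str:
    """Uppercase a model number and drop everything but letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", model.upper())


def model_similarity(a: str, b: str) -> float:
    """Length of the common prefix divided by the longer normalized length.

    Assumption: variants share a leading model stem; the paper does not
    specify how model-number similarity is measured.
    """
    a, b = normalize(a), normalize(b)
    if not a or not b:
        return 0.0
    prefix = 0
    for x, y in zip(a, b):
        if x != y:
            break
        prefix += 1
    return prefix / max(len(a), len(b))


def cluster_variants(products, threshold=0.6):
    """Greedy agglomerative grouping under a cannot-link constraint.

    Hypothetical constraint: products of different brands may never share
    a family, no matter how similar their model numbers look.
    """
    by_id = {p["id"]: p for p in products}
    clusters = {p["id"]: {p["id"]} for p in products}  # cluster key -> members
    owner = {p["id"]: p["id"] for p in products}       # product id -> cluster key

    # Consider the most similar pairs first.
    pairs = sorted(
        (
            (model_similarity(a["model"], b["model"]), a["id"], b["id"])
            for a, b in combinations(products, 2)
        ),
        reverse=True,
    )
    for sim, i, j in pairs:
        if sim < threshold:
            break  # all remaining pairs are less similar
        ci, cj = owner[i], owner[j]
        if ci == cj:
            continue  # already in the same family
        brands = {by_id[m]["brand"] for m in clusters[ci] | clusters[cj]}
        if len(brands) > 1:
            continue  # merging would violate the cannot-link constraint
        clusters[ci] |= clusters[cj]
        for m in clusters[cj]:
            owner[m] = ci
        del clusters[cj]
    return sorted(sorted(members) for members in clusters.values())


# Made-up product records, for illustration only.
products = [
    {"id": "p1", "brand": "Acme", "model": "DR-1200-RED"},
    {"id": "p2", "brand": "Acme", "model": "DR-1200-BLU"},
    {"id": "p3", "brand": "Bolt", "model": "DR-1200-RED"},
    {"id": "p4", "brand": "Acme", "model": "XT-90"},
]
print(cluster_variants(products))
```

In this toy input, p1 and p2 end up in one family, while p3 stays separate despite an identical model number because its brand differs. Every merge or refusal traces back to a single readable rule (a similarity score or a named constraint), which is the kind of property that an interpretability requirement like the one described above would favor.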