CL, LG, Dec 14, 2018

Don't Classify, Translate: Multi-Level E-Commerce Product Categorization Via Machine Translation

arXiv:1812.05774v1, 19 citations
AI Analysis

This paper addresses multi-level product categorization for e-commerce platforms and offers a new paradigm rather than an incremental improvement on existing classifiers.

The paper tackles product categorization in e-commerce by proposing a machine translation approach that converts product descriptions into taxonomy paths, achieving better predictive accuracy than state-of-the-art classification systems on two large datasets.

E-commerce platforms categorize their products into a multi-level taxonomy tree with thousands of leaf categories. Conventional methods for product categorization are typically based on machine learning classification algorithms. These algorithms take product information as input (e.g., titles and descriptions) to classify a product into a leaf category. In this paper, we propose a new paradigm based on machine translation. In our approach, we translate a product's natural language description into a sequence of tokens representing a root-to-leaf path in a product taxonomy. In our experiments on two large real-world datasets, we show that our approach achieves better predictive accuracy than a state-of-the-art classification system for product categorization. In addition, we demonstrate that our machine translation models can propose meaningful new paths between previously unconnected nodes in a taxonomy tree, thereby transforming the taxonomy into a directed acyclic graph (DAG). We discuss how the resultant taxonomy DAG promotes user-friendly navigation, and how it is more adaptable to new products.
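To make the translation framing concrete, the following is a minimal sketch of how a training pair might be built and how a predicted path could reveal a new taxonomy edge. It is not the authors' code. The toy taxonomy `PARENT`, the helper names, the whitespace tokenization, and the choice of one target token per category node are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy taxonomy stored as child -> parent (None marks a root).
PARENT: Dict[str, Optional[str]] = {
    "Electronics": None,
    "Cameras": "Electronics",
    "Lenses": "Cameras",
    "Sports": None,
    "Action Cameras": "Sports",
}


def root_to_leaf_path(leaf: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Walk up from a leaf to its root, then reverse to get root-to-leaf order."""
    path: List[str] = []
    node: Optional[str] = leaf
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def make_pair(
    title: str, leaf: str, parent: Dict[str, Optional[str]]
) -> Tuple[List[str], List[str]]:
    """Build one (source, target) example for a sequence-to-sequence model.

    Source: the product title as lowercase whitespace tokens (an assumption).
    Target: one token per category node on the root-to-leaf path (an assumption).
    """
    return title.lower().split(), root_to_leaf_path(leaf, parent)


def tree_edges(parent: Dict[str, Optional[str]]) -> Set[Tuple[str, str]]:
    """Return the (parent, child) edges that exist in the original tree."""
    return {(p, c) for c, p in parent.items() if p is not None}


def new_edges(
    predicted_path: List[str], parent: Dict[str, Optional[str]]
) -> List[Tuple[str, str]]:
    """List consecutive node pairs in a predicted path that are not tree edges.

    Adding such edges gives a node more than one parent, so the taxonomy
    stops being a tree and becomes a DAG (provided no cycle is introduced).
    """
    known = tree_edges(parent)
    return [
        (a, b)
        for a, b in zip(predicted_path, predicted_path[1:])
        if (a, b) not in known
    ]


if __name__ == "__main__":
    print(make_pair("Wide Angle Lens", "Lenses", PARENT))
    # A hypothetical model output that links two previously unconnected nodes.
    print(new_edges(["Electronics", "Cameras", "Action Cameras"], PARENT))
```

In this sketch a classifier would only ever output the leaf label, whereas the sequence model emits every node on the path, which is what allows it to produce a path such as Electronics, Cameras, Action Cameras that does not exist in the original tree.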
