IR, CL, LG
Apr 23, 2021

DeepCAT: Deep Category Representation for Query Understanding in E-commerce Search

arXiv:2104.11760v2
3 citations
Originality: Incremental advance
AI Analysis

This work addresses query understanding challenges in e-commerce search, particularly for minority categories and tail queries, representing an incremental improvement in a domain-specific application.

The paper tackles the problem of mapping search queries to product categories in e-commerce search, where training data is class-imbalanced and tail queries are poorly represented. It reports a 10% improvement on minority classes and a 7.1% improvement on tail queries over a state-of-the-art label embedding model.

Mapping a search query to a set of relevant categories in the product taxonomy is a significant challenge in e-commerce search for two reasons: 1) training data exhibits a severe class imbalance problem due to biased click behavior, and 2) queries with little customer feedback (e.g., tail queries) are not well represented in the training set and cause difficulties for query understanding. To address these problems, we propose a deep learning model, DeepCAT, which learns joint word-category representations to enhance the query understanding process. We believe learning category interactions helps to improve the performance of category mapping on minority classes and on tail and torso queries. DeepCAT contains a novel word-category representation model that trains the category representations based on word-category co-occurrences in the training set. The category representation is then leveraged to introduce a new loss function that estimates the category-category co-occurrences for refining joint word-category embeddings. To demonstrate our model's effectiveness on minority categories and tail queries, we conduct two sets of experiments. The results show that DeepCAT reaches a 10% improvement on minority classes and a 7.1% improvement on tail queries over a state-of-the-art label embedding model. Our findings suggest a promising direction for improving e-commerce search by semantic modeling of taxonomy hierarchies.
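
The abstract mentions two kinds of statistics, word-category and category-category co-occurrences, without giving details here. As a rough illustration only, the sketch below counts both from a toy training set. It is not the DeepCAT model or its loss function; the data, the `TRAIN` list, and the `cooccurrence_counts` helper are hypothetical names chosen for this example, and whitespace tokenization is an assumption.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy training set: (query, set of relevant categories).
# Real data would come from click logs; these rows are made up.
TRAIN = [
    ("red running shoes", {"Shoes", "Sports"}),
    ("running shorts", {"Sports", "Clothing"}),
    ("leather shoes", {"Shoes"}),
]


def cooccurrence_counts(examples):
    """Count word-category and category-category co-occurrences.

    Assumption: a word co-occurs with a category when it appears in a
    query labeled with that category, and two categories co-occur when
    they label the same query.
    """
    word_cat = Counter()
    cat_cat = Counter()
    for query, categories in examples:
        # Naive whitespace tokenization; each word counted once per query.
        words = set(query.lower().split())
        for word in words:
            for cat in categories:
                word_cat[(word, cat)] += 1
        # Unordered category pairs, keyed in sorted order.
        for a, b in combinations(sorted(categories), 2):
            cat_cat[(a, b)] += 1
    return word_cat, cat_cat


word_cat, cat_cat = cooccurrence_counts(TRAIN)
```

In a setup like this, `word_cat[("running", "Sports")]` reflects how often the word "running" appears in queries labeled "Sports". Counts of this kind could serve as targets when training embeddings, though how DeepCAT actually uses them is specified in the paper, not here.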

