IR, CL | Jul 1, 2023

Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model

Jiong Cai, Yong Jiang, Yue Zhang, Chengyue Jiang, Ke Yu, Jianhui Ji, Rong Xiao, Haihong Tang, Tao Wang, Zhongqiang Huang, Pengjun Xie, Fei Huang

arXiv:2307.00370v23.51 citations | h-index: 42

Originality: Incremental advance

AI Analysis

This work addresses the problem of fast and accurate relevance prediction for e-commerce search systems, offering an incremental improvement over existing models like Bi-encoder and Cross-encoder.

The paper tackles the trade-off between accuracy and inference speed in e-commerce search relevance models. It proposes an Entity-Based Relevance Model (EBRM) that decomposes query-item relevance into multiple query-entity relevance problems, which are scored by a Cross-encoder for accuracy and cached for fast online inference. The authors report promising improvements with computational efficiency.

Discovering the intended items of user queries from a massive repository of items is one of the main goals of an e-commerce search system. Relevance prediction is essential to the search system since it helps improve performance. When a relevance model is served online, it is required to perform fast and accurate inference. Currently, widely used models such as the Bi-encoder and the Cross-encoder are limited in accuracy and inference speed, respectively. In this work, we propose a novel model called the Entity-Based Relevance Model (EBRM). We identify the entities contained in an item and decompose the QI (query-item) relevance problem into multiple QE (query-entity) relevance problems; we then aggregate their results to form the QI prediction using a soft logic formulation. The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy as well as to cache QE predictions for fast online inference. Utilizing soft logic makes the prediction procedure interpretable and intervenable. We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance. The proposed method is evaluated on labeled data from e-commerce websites. Empirical results show that it achieves promising improvements with computational efficiency.
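The decompose, score, cache, and aggregate flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and not taken from the paper: `toy_qe_score` is a token-overlap stand-in for the Cross-encoder QE module, `EntityRelevanceSketch` is a hypothetical class, and the aggregation uses a product of QE scores (a soft conjunction), since the abstract does not specify the exact soft logic formulation.

```python
import math
from typing import Callable, Dict, List, Tuple


def toy_qe_score(query: str, entity: str) -> float:
    """Hypothetical stand-in for a Cross-encoder QE module.

    Returns the fraction of entity tokens that also appear in the query.
    """
    q_tokens = set(query.lower().split())
    e_tokens = set(entity.lower().split())
    if not e_tokens:
        return 0.0
    return len(q_tokens & e_tokens) / len(e_tokens)


class EntityRelevanceSketch:
    """Illustrative QI scorer built from cached QE scores (not the paper's code)."""

    def __init__(self, qe_scorer: Callable[[str, str], float]) -> None:
        self._scorer = qe_scorer
        self._cache: Dict[Tuple[str, str], float] = {}
        self.scorer_calls = 0  # counts how often the expensive scorer runs

    def qe(self, query: str, entity: str) -> float:
        # Each (query, entity) pair is scored once and then served from the cache.
        key = (query, entity)
        if key not in self._cache:
            self.scorer_calls += 1
            self._cache[key] = self._scorer(query, entity)
        return self._cache[key]

    def qi(self, query: str, entities: List[str]) -> float:
        # Assumption: soft AND via the product of QE scores.
        # Assumption: an item with no entities is treated as not relevant.
        if not entities:
            return 0.0
        return math.prod(self.qe(query, e) for e in entities)

    def explain(self, query: str, entities: List[str]) -> Dict[str, float]:
        # Per-entity scores expose which entity drove the QI prediction.
        return {e: self.qe(query, e) for e in entities}


model = EntityRelevanceSketch(toy_qe_score)
score_a = model.qi("red running shoes", ["red", "shoes"])
score_b = model.qi("red running shoes", ["red", "leather shoes"])
```

In this sketch the second call reuses the cached score for the pair ("red running shoes", "red"), so the scorer runs three times rather than four. The per-entity scores returned by `explain` hint at how such a decomposition can be inspected, and overwriting a cached entry would be one simple way to intervene on a prediction.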
