IR | CL | Aug 27, 2024

MRSE: An Efficient Multi-modality Retrieval System for Large Scale E-commerce

arXiv:2408.14968v1 | 2 citations | h-index: 18
Originality: Incremental advance
AI Analysis

This work addresses the challenge of unreliable search results in e-commerce due to over-reliance on textual features and suboptimal multi-modality integration, offering a solution for platforms like Shopee.

The paper aims to improve item recall for text queries in large-scale e-commerce search by addressing the limitations of both uni-modality and multi-modality retrieval systems. It reports an 18.9% improvement in offline relevance and a 3.7% gain in online core metrics over Shopee's state-of-the-art uni-modality system.

Providing high-quality item recall for text queries is crucial in large-scale e-commerce search systems. Current Embedding-based Retrieval Systems (ERS) embed queries and items into a shared low-dimensional space, but uni-modality ERS rely too heavily on textual features, making them unreliable in complex contexts. While multi-modality ERS incorporate various data sources, they often overlook individual preferences for different modalities, leading to suboptimal results. To address these issues, we propose MRSE, a Multi-modality Retrieval System that integrates text, item images, and user preferences through lightweight mixture-of-expert (LMoE) modules to better align features across and within modalities. MRSE also builds user profiles at a multi-modality level and introduces a novel hybrid loss function that enhances consistency and robustness using hard negative sampling. Experiments on a large-scale dataset from Shopee and online A/B testing show that MRSE achieves an 18.9% improvement in offline relevance and a 3.7% gain in online core metrics compared to Shopee's state-of-the-art uni-modality system.
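
To make the idea of embedding-based retrieval with modality weighting concrete, here is a minimal sketch in plain Python. It is not the paper's LMoE architecture or its hybrid loss: the function names (`fuse`, `retrieve`), the two-way softmax gate, the toy 3-dimensional vectors, and the `ITEMS` catalogue are all assumptions made for illustration. It only shows how a per-user preference over text and image embeddings could change which items are recalled for the same query.

```python
import math


def softmax(logits):
    """Turn raw preference logits into weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(text_emb, image_emb, gate_logits):
    """Mix text and image embeddings with a hypothetical per-user gate."""
    w_text, w_image = softmax(gate_logits)
    mixed = [w_text * t + w_image * i for t, i in zip(text_emb, image_emb)]
    return normalize(mixed)


def retrieve(query_emb, items, gate_logits, k=2):
    """Return the ids of the k items with the highest cosine similarity."""
    q = normalize(query_emb)
    scored = []
    for item_id, (text_emb, image_emb) in items.items():
        item_vec = fuse(normalize(text_emb), normalize(image_emb), gate_logits)
        score = sum(a * b for a, b in zip(q, item_vec))
        scored.append((score, item_id))
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:k]]


# Toy catalogue (made-up values): item id -> (text embedding, image embedding).
ITEMS = {
    "red_dress": ([1.0, 0.0, 0.0], [0.9, 0.1, 0.0]),
    "red_shoes": ([0.8, 0.2, 0.0], [0.0, 1.0, 0.0]),
    "blue_phone": ([0.0, 0.0, 1.0], [0.0, 0.1, 0.9]),
}

# Same query, two users: one leans on text features, one on image features.
text_leaning = retrieve([0.0, 1.0, 0.0], ITEMS, gate_logits=[5.0, -5.0], k=1)
image_leaning = retrieve([0.0, 1.0, 0.0], ITEMS, gate_logits=[-5.0, 5.0], k=1)
```

In this toy setup the gate is a fixed pair of logits per user. A real system would presumably learn such weights jointly with the encoders and serve the fused vectors from an approximate nearest-neighbour index rather than scoring every item.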

