LG IR

May 28, 2022

Automatic Expert Selection for Multi-Scenario and Multi-Task Search

Xinyu Zou, Zhi Hu, Yiming Zhao, Xuchu Ding, Zhongyi Liu, Chenliang Li, Aixin Sun

arXiv:2205.14321v211.147 citations h-index: 63

Originality: Incremental advance

AI Analysis

This work addresses the need for more effective and automated models in search systems to handle diverse user scenarios and tasks, representing an incremental improvement over existing methods like MMoE.

The paper tackles the problem of optimizing multi-scenario and multi-task learning in search services by proposing AESM^2, a framework with automatic expert selection, which demonstrated effectiveness in experiments on real-world datasets and achieved substantial performance gains in online A/B tests.

Multi-scenario learning (MSL) enables a service provider to cater for users' fine-grained demands by separating services for different user sectors, e.g., by users' geographical region. Under each scenario, there is a need to optimize multiple task-specific targets, e.g., click-through rate and conversion rate, which is known as multi-task learning (MTL). Recent solutions for MSL and MTL are mostly based on the multi-gate mixture-of-experts (MMoE) architecture. The MMoE structure is typically static and its design requires domain-specific knowledge, making it less effective in handling both MSL and MTL. In this paper, we propose a novel Automatic Expert Selection framework for Multi-scenario and Multi-task search, named AESM^{2}. AESM^{2} integrates both MSL and MTL into a unified framework with automatic structure learning. Specifically, AESM^{2} stacks multi-task layers over multi-scenario layers. This hierarchical design enables us to flexibly establish intrinsic connections between different scenarios while also supporting high-level feature extraction for different tasks. At each multi-scenario/multi-task layer, a novel expert selection algorithm is proposed to automatically identify scenario-/task-specific and shared experts for each input. Experiments on two real-world large-scale datasets demonstrate the effectiveness of AESM^{2} over a battery of strong baselines. An online A/B test also shows substantial performance gains on multiple metrics. Currently, AESM^{2} has been deployed online to serve major traffic.
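The stacked layout described in the abstract (multi-task layers on top of multi-scenario layers, each choosing a subset of experts per input) can be illustrated with a small sketch. The abstract does not specify the selection algorithm, so the top-k gate below is only a stand-in assumption, not the authors' method. All names (`GatedExpertLayer`, `predict`, `top_k`) and the layer sizes are hypothetical, and the weights are random rather than trained.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GatedExpertLayer:
    """MMoE-style layer: a shared pool of experts and one gate per context.

    A context is a scenario id in the lower layer and a task id in the
    upper layer. Top-k gating is an ASSUMED stand-in for the paper's
    expert selection algorithm, which the abstract does not detail.
    """

    def __init__(self, in_dim, out_dim, num_experts, num_contexts, top_k, seed=0):
        rng = random.Random(seed)
        self.experts = [random_matrix(out_dim, in_dim, rng) for _ in range(num_experts)]
        self.gates = [random_matrix(num_experts, in_dim, rng) for _ in range(num_contexts)]
        self.out_dim = out_dim
        self.top_k = top_k

    def forward(self, x, context):
        # Gate scores depend on both the input and the context.
        weights = softmax(matvec(self.gates[context], x))
        # Keep only the top_k highest-weighted experts for this input.
        ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
        selected = ranked[: self.top_k]
        norm = sum(weights[i] for i in selected)
        out = [0.0] * self.out_dim
        for i in selected:
            expert_out = [math.tanh(v) for v in matvec(self.experts[i], x)]
            for j, value in enumerate(expert_out):
                out[j] += (weights[i] / norm) * value
        return out, selected


# Lower layer is gated by scenario, upper layer by task (hypothetical sizes).
scenario_layer = GatedExpertLayer(in_dim=4, out_dim=3, num_experts=4, num_contexts=2, top_k=2, seed=1)
task_layer = GatedExpertLayer(in_dim=3, out_dim=2, num_experts=3, num_contexts=2, top_k=2, seed=2)


def predict(x, scenario, task):
    """Route an input through the scenario layer, then the task layer."""
    hidden, scenario_experts = scenario_layer.forward(x, scenario)
    output, task_experts = task_layer.forward(hidden, task)
    return output, scenario_experts, task_experts


output, scenario_experts, task_experts = predict([0.1, -0.2, 0.3, 0.4], scenario=0, task=1)
print(output, scenario_experts, task_experts)
```

In this sketch, experts picked under several contexts behave as shared experts and the rest as context-specific ones; how AESM^{2} actually draws that distinction is not stated in the abstract.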
