IR, CL · Jul 24, 2019

Generic Intent Representation in Web Search

arXiv:1907.10710v1 · 60 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of sparse tail search traffic for search engines, though it is incremental as it builds on existing representation methods.

The paper tackles the problem of representing user intent in web search by introducing GEN Encoder, which learns embeddings from large-scale click logs. It achieves robust improvements in query intent similarity modeling and, combined with approximate nearest neighbor search over previously seen queries, cuts the number of unseen queries by half.

This paper presents GEneric iNtent Encoder (GEN Encoder), which learns a distributed representation space for user intent in search. Leveraging large-scale user clicks from Bing search logs as weak supervision of user intent, GEN Encoder learns to map queries with shared clicks into similar embeddings end-to-end and then fine-tunes on multiple paraphrase tasks. Experimental results on an intrinsic evaluation task (query intent similarity modeling) demonstrate GEN Encoder's robust and significant advantages over previous representation methods. Ablation studies reveal the crucial role of learning from implicit user feedback in representing user intent and the contributions of multi-task learning in representation generality. We also demonstrate that GEN Encoder alleviates the sparsity of tail search traffic and cuts down half of the unseen queries by using an efficient approximate nearest neighbor search to effectively identify previous queries with the same search intent. Finally, we demonstrate that distances between GEN encodings reflect certain information seeking behaviors in search sessions.
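The idea of matching an unseen tail query to a previously seen query with the same intent can be illustrated with a minimal sketch. This is not the GEN Encoder itself: the three-dimensional vectors below are made-up stand-ins for learned embeddings, the exhaustive cosine-similarity scan is a simple substitute for the approximate nearest neighbor search mentioned in the abstract, and the function name `match_seen_query` and the 0.9 threshold are hypothetical choices for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_seen_query(new_vec, seen, threshold=0.9):
    """Return (query, similarity) for the closest seen query.

    The query is None when the best similarity is below the threshold.
    Exact brute-force search is used here; a production system would
    likely use an approximate nearest neighbor index instead.
    """
    best_query, best_sim = None, -1.0
    for query, vec in seen.items():
        sim = cosine(new_vec, vec)
        if sim > best_sim:
            best_query, best_sim = query, sim
    if best_sim >= threshold:
        return best_query, best_sim
    return None, best_sim


# Toy embeddings (assumed values, not produced by any real encoder).
seen_queries = {
    "weather seattle": [0.9, 0.1, 0.0],
    "python list sort": [0.0, 0.2, 0.95],
}

# Hypothetical embedding of an unseen query such as "seattle forecast today".
tail_query_vec = [0.88, 0.15, 0.02]
# Hypothetical embedding of a query with no close match among seen queries.
unrelated_vec = [0.1, 0.9, 0.1]

matched, _ = match_seen_query(tail_query_vec, seen_queries)
unmatched, _ = match_seen_query(unrelated_vec, seen_queries)
```

In this toy setup the first query is mapped to a seen query while the second stays unmatched, which mirrors how a similarity threshold in the embedding space would decide whether a tail query can reuse what is known about an earlier query.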
