IR · CL · Aug 3, 2016

Query Clustering using Segment Specific Context Embeddings

arXiv:1608.01247v2 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of monetizing search engine queries for businesses by clustering the queries into interest areas, though it appears incremental in that it combines existing NLP and clustering techniques.

The paper tackles the problem of identifying user interest areas from search queries by developing a query clustering approach that uses context embeddings from search results and a scalable clustering algorithm. The method was tested across multiple segments (Retail, Travel, Health, Phones) and found effective for discovering monetizable interest areas.

This paper presents a novel query clustering approach to capture the broad interest areas of users querying search engines. We build on a recent advance in NLP, word2vec, and extend it to query2vec: vector representations of queries based on query contexts obtained from the top search results for each query. We then apply a highly scalable Divide & Merge clustering algorithm to the query vectors to obtain the clusters. We have tried this approach on a variety of segments, including Retail, Travel, Health, and Phones, and found the clusters to be effective in discovering users' interest areas that have high monetization potential.
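The pipeline described in the abstract (context-based query vectors followed by a divide step and a merge step) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's method: the toy `WORD_VECS` table stands in for trained word2vec embeddings, `query2vec` simply averages word vectors over result snippets, and the `divide` and `merge` functions, along with their `min_sim` and `merge_sim` thresholds, are one plausible reading of "Divide & Merge" (recursive bisection, then merging of clusters with similar centroids).

```python
import numpy as np

# Hypothetical toy word vectors; a real system would load trained word2vec embeddings.
WORD_VECS = {
    "flight": np.array([1.0, 0.0, 0.0]),
    "hotel": np.array([0.9, 0.1, 0.0]),
    "airline": np.array([1.0, 0.1, 0.0]),
    "phone": np.array([0.0, 1.0, 0.0]),
    "android": np.array([0.0, 0.9, 0.1]),
    "battery": np.array([0.1, 1.0, 0.0]),
}
DIM = 3

def _unit(v):
    """Return v scaled to unit length (or unchanged if it is the zero vector)."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def query2vec(snippets):
    """Assumed scheme: average the vectors of known words in a query's result snippets."""
    words = [w for s in snippets for w in s.lower().split() if w in WORD_VECS]
    if not words:
        return np.zeros(DIM)
    return _unit(np.mean([WORD_VECS[w] for w in words], axis=0))

def divide(vecs, idx, min_sim=0.9):
    """Recursively bisect a cluster until all members are close to its centroid."""
    centroid = _unit(vecs[idx].mean(axis=0))
    sims = vecs[idx] @ centroid
    if len(idx) < 2 or sims.min() >= min_sim:
        return [idx]
    a = idx[int(np.argmin(sims))]                 # member farthest from the centroid
    b = idx[int(np.argmin(vecs[idx] @ vecs[a]))]  # member farthest from a
    to_a = vecs[idx] @ vecs[a] >= vecs[idx] @ vecs[b]
    left, right = idx[to_a], idx[~to_a]
    if len(left) == 0 or len(right) == 0:
        return [idx]
    return divide(vecs, left, min_sim) + divide(vecs, right, min_sim)

def merge(vecs, clusters, merge_sim=0.95):
    """Greedily merge clusters whose centroids have cosine similarity >= merge_sim."""
    clusters = list(clusters)
    changed = True
    while changed:
        changed = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci = _unit(vecs[clusters[i]].mean(axis=0))
                cj = _unit(vecs[clusters[j]].mean(axis=0))
                if ci @ cj >= merge_sim:
                    clusters[i] = np.concatenate([clusters[i], clusters[j]])
                    del clusters[j]
                    changed = True
                    break
            if changed:
                break
    return clusters

def cluster_queries(queries):
    """Map {query: [result snippets]} to a list of clusters of query strings."""
    names = list(queries)
    vecs = np.vstack([query2vec(queries[q]) for q in names])
    clusters = merge(vecs, divide(vecs, np.arange(len(names))))
    return [sorted(names[i] for i in c) for c in clusters]

# Hypothetical queries with made-up result snippets as their contexts.
QUERIES = {
    "cheap flights": ["flight airline deals", "airline flight booking"],
    "hotel booking": ["hotel flight packages", "hotel deals"],
    "best android phone": ["android phone reviews", "phone battery life"],
    "phone battery": ["battery phone tips"],
}
result = cluster_queries(QUERIES)
print(result)
```

In this sketch the divide step only splits clusters that are not yet cohesive, and the merge step repairs over-splitting by rejoining clusters with near-identical centroids. Both thresholds are arbitrary illustrative values and would need tuning per segment.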

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
