IR · Oct 9, 2021

Lookup or Exploratory: What is Your Search Intent?

arXiv:2110.04640v1 · 3 citations

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of identifying query specificity in real time for commercial search engines. It appears incremental because it builds on existing intent classification methods.

The paper tackles the problem of classifying search queries as either Exploratory or Lookup intent at runtime to improve search results and suggestions, achieving high accuracy and a response time of less than one millisecond.

Search query specificity is broadly divided into two categories: Exploratory and Lookup. If a query's specificity can be identified at run time, it can be used to significantly improve the search results as well as the quality of suggestions for altering the query. However, with millions of queries arriving every day at a commercial search engine, it is non-trivial to develop a horizontal technique to determine query specificity at run time. Existing techniques either suffer from a lack of training data or depend on information such as query length or session information. In this paper, we show that such methodologies are inadequate or at times misleading. We propose a novel methodology to overcome these limitations. First, we demonstrate a heuristic-based method to identify Exploratory or Lookup intent queries at scale, classifying millions of queries into the two classes with high accuracy, as shown in our experiments. Our methodology does not depend on session data or on query length. Next, we train a transformer-based deep neural network to classify the queries into one of the two classes at run time. Our method uses a bidirectional GRU initialized with pretrained BERT-base-uncased embeddings and an augmented triplet loss to classify the intent of queries without using any session data. We also introduce a novel Semi-Greedy Iterative Training approach to fine-tune our model. Our model is deployable for real-time query specificity identification with a response time of less than one millisecond. Our technique is generic, and the results have valuable implications for improving the quality of search results and suggestions.
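The abstract names a triplet loss but does not say how it is augmented. As a rough illustration only, the sketch below shows the standard triplet margin loss on toy vectors: an anchor query embedding is pulled toward a same-class (positive) embedding and pushed away from an other-class (negative) one. The function names, the margin value, and the 2-d "embeddings" are all assumptions made for this example, not details from the paper.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (NOT the paper's augmented variant).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Toy 2-d vectors standing in for query embeddings (made-up numbers).
anchor = [0.0, 0.0]    # e.g. a Lookup query
positive = [0.0, 1.0]  # another Lookup query, close to the anchor
negative = [3.0, 4.0]  # an Exploratory query, far from the anchor

# Well-separated triplet: d(a, p) = 1, d(a, n) = 5, so the loss is clipped to 0.
loss_easy = triplet_loss(anchor, positive, negative)

# Swapping the roles gives a violated triplet: 5 - 1 + 1 = 5.
loss_hard = triplet_loss(anchor, negative, positive)
```

In a setup like the one the abstract describes, the vectors would presumably come from the bidirectional GRU encoder rather than being written by hand, and the loss would be averaged over a batch of triplets.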
