CL, Dec 18, 2018

Predicting user intent from search queries using both CNNs and RNNs

arXiv:1812.07324v1, 3 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses user intent prediction for web search applications. Its contribution is incremental: it combines existing methods (CNNs and RNNs) on a known dataset.

The paper predicts user intent from the text of search queries alone, without additional signals such as location or device. The authors hand-label a small part of the EDI dataset to serve as ground truth and use rule-based methods to label the rest automatically. They then train CNN and RNN classifiers on the automatically labeled data and evaluate them against the ground truth.

Predicting user behaviour on a website is a difficult task that requires integrating multiple sources of information, such as geo-location, user profile or web surfing history. In this paper we tackle the problem of predicting user intent based on the queries that were used to access a certain webpage. We make no additional assumptions, such as domain detection, device used or location, and only use the word information embedded in the given query. In order to build competitive classifiers, we label a small fraction of the EDI query intent prediction dataset \cite{edi-challenge-dataset}, which is used as ground truth. Then, using various rule-based approaches, we automatically label the rest of the dataset, train the classifiers and evaluate the quality of the automatic labeling on the ground truth dataset. We use both recurrent and convolutional networks as models, and represent the words in the query with multiple embedding methods.
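The labeling scheme described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the intent names, the keyword rules, the toy queries, and the helper names `rule_label` and `agreement` are hypothetical and are not taken from the paper or from the EDI dataset. The sketch only shows rule-based labeling and a check of the rule labels against a small hand-labeled set. The step of training CNN or RNN classifiers on the automatically labeled data is omitted.

```python
import re

# Hypothetical intents and keyword rules (NOT the rules used in the paper).
# Rules are tried in order; the first match wins.
RULES = [
    ("transactional", re.compile(r"\b(buy|price|cheap|order)\b")),
    ("navigational", re.compile(r"\b(login|homepage|official site)\b")),
    ("informational", re.compile(r"\b(how|what|why|guide)\b")),
]
DEFAULT_INTENT = "informational"  # assumed fallback when no rule fires


def rule_label(query):
    """Return the intent of the first rule matching the lowercased query."""
    text = query.lower()
    for intent, pattern in RULES:
        if pattern.search(text):
            return intent
    return DEFAULT_INTENT


def agreement(labeled_pairs):
    """Fraction of (query, gold_intent) pairs where the rule label matches."""
    hits = sum(rule_label(query) == gold for query, gold in labeled_pairs)
    return hits / len(labeled_pairs)


# Toy hand-labeled subset standing in for the ground truth.
GROUND_TRUTH = [
    ("buy running shoes", "transactional"),
    ("facebook login", "navigational"),
    ("how to tie a tie", "informational"),
    ("cheap flights guide", "informational"),  # "cheap" fires first: a miss
]

# Toy unlabeled queries standing in for the rest of the dataset.
UNLABELED = ["order pizza online", "what is an rnn"]

auto_labeled = [(query, rule_label(query)) for query in UNLABELED]
score = agreement(GROUND_TRUTH)
```

In this toy setup the last ground-truth query is mislabeled because an earlier rule fires first, which is the kind of labeling noise that evaluating against a hand-labeled set is meant to expose.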

Code Implementations: 1 repo