IR · Jul 2, 2019

Learning to Reformulate the Queries on the WEB

arXiv:1907.01300v1
Originality: Incremental advance
AI Analysis

This paper addresses poor search results for naive users of web search engines. Its contribution is incremental, since it builds on existing sequence-to-sequence models.

The paper tackles naive users' difficulty in formulating effective web search queries. It proposes an end-to-end encoder-decoder model that automatically generates reformulated queries, trained on over one billion anchor phrases from the ClueWeb09 corpus. Experiments on TREC collections show that it significantly improves retrieval performance.

The inability of naive users to formulate appropriate queries is a fundamental problem in web search engines. Therefore, assisting users to issue more effective queries is an important way to improve user satisfaction. One effective approach is query reformulation, which generates new effective queries from the current query issued by the user. Previous research typically generates words and phrases related to the original query. Since the definition of query reformulation is quite general, it is difficult to develop a uniform term-based approach for this problem. This paper uses readily available data, particularly over one billion anchor phrases in the ClueWeb09 corpus, to learn an end-to-end encoder-decoder model that automatically generates effective queries. Following successful research on sequence-to-sequence models, we employ a character-level convolutional neural network with max-pooling at the encoder and an attention-based recurrent neural network at the decoder. The whole model is learned in an unsupervised end-to-end manner. Experiments on TREC collections show that the reformulated queries automatically generated by the proposed solution can significantly improve retrieval performance.
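The encoder side of the architecture named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the alphabet, filter count, filter width, pooling size, random untrained weights, and the choice of dot-product attention are all assumptions made for illustration, and the recurrent decoder itself is omitted. The sketch only shows how a character-level convolution with max-pooling could turn a query into a shorter sequence of feature vectors, and how one attention step could weight those vectors.

```python
import math
import random

# Hypothetical character inventory; the abstract does not specify one.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "


def one_hot(query):
    """Map each character to a one-hot vector; unknown characters become all zeros."""
    vectors = []
    for ch in query.lower():
        vec = [0.0] * len(ALPHABET)
        idx = ALPHABET.find(ch)
        if idx >= 0:
            vec[idx] = 1.0
        vectors.append(vec)
    return vectors


def conv1d(seq, filters, width):
    """Valid 1-D convolution over character positions, followed by ReLU."""
    out = []
    for t in range(len(seq) - width + 1):
        window = [x for vec in seq[t:t + width] for x in vec]  # flatten the window
        out.append([max(0.0, sum(w * x for w, x in zip(f, window))) for f in filters])
    return out


def max_pool(seq, size):
    """Non-overlapping max-pooling over time; shortens the sequence by a factor of `size`."""
    pooled = []
    for t in range(0, len(seq) - size + 1, size):
        block = seq[t:t + size]
        pooled.append([max(column) for column in zip(*block)])
    return pooled


def encode(query, n_filters=8, width=3, pool=2, seed=0):
    """Character-level CNN encoder with random (untrained) filters. All sizes are assumptions."""
    rng = random.Random(seed)
    filters = [[rng.uniform(-0.1, 0.1) for _ in range(width * len(ALPHABET))]
               for _ in range(n_filters)]
    return max_pool(conv1d(one_hot(query), filters, width), pool)


def attention(decoder_state, encoder_states):
    """One dot-product attention step: softmax weights and the weighted-sum context vector."""
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(len(decoder_state))]
    return weights, context


# Usage: encode a query, then attend with a placeholder decoder state.
states = encode("cheap flights paris")
weights, context = attention([0.1] * 8, states)
```

In a trained model the filters would be learned and the decoder state would come from the recurrent network at each output step; here both are placeholders so the data flow can be followed end to end.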
