Deep Neural Networks for Query Expansion using Word Embeddings
This work addresses the vocabulary mismatch problem in information retrieval systems, though it appears incremental because it builds on existing embedding-based methods.
The paper tackles the problem of selecting useful terms for query expansion in information retrieval by introducing a neural network classifier that predicts term usefulness from word embeddings. Experiments on four TREC collections show that this approach significantly improves retrieval performance over the baselines and gives more robust results.
Query expansion is a method for alleviating the vocabulary mismatch problem present in information retrieval tasks. Previous works have shown that terms selected for query expansion by traditional methods such as pseudo-relevance feedback are not always helpful to the retrieval process. In this paper, we show that this is also true for more recently proposed embedding-based query expansion methods. We then introduce an artificial neural network classifier to predict the usefulness of query expansion terms. This classifier uses term word embeddings as inputs. Experiments on four TREC newswire and web collections show that using terms selected by the classifier for expansion significantly improves retrieval performance when compared to competitive baselines. The results are also shown to be more robust than the baselines.
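The pipeline described above (propose expansion candidates from embeddings, then filter them with a classifier that takes embeddings as input) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `cosine_candidates` and `TermUsefulnessMLP` are hypothetical, the single-hidden-layer architecture and the choice to concatenate the candidate embedding with the mean query embedding are assumptions, and the embeddings and labels are random placeholders rather than real data.

```python
import numpy as np


def cosine_candidates(query_vecs, vocab_vecs, k):
    """Return indices of the k vocabulary terms closest (by cosine) to the mean query vector."""
    q = query_vecs.mean(axis=0)
    q = q / np.linalg.norm(q)
    v = vocab_vecs / np.linalg.norm(vocab_vecs, axis=1, keepdims=True)
    return np.argsort(-(v @ q))[:k]


class TermUsefulnessMLP:
    """Hypothetical one-hidden-layer classifier: embedding features -> P(term is useful)."""

    def __init__(self, dim, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, hidden)
        self.b2 = 0.0

    def predict_proba(self, X):
        h = np.tanh(X @ self.W1 + self.b1)
        return 1.0 / (1.0 + np.exp(-(h @ self.w2 + self.b2)))

    def fit(self, X, y, lr=0.5, epochs=500):
        # Plain batch gradient descent on mean binary cross-entropy.
        for _ in range(epochs):
            h = np.tanh(X @ self.W1 + self.b1)
            p = 1.0 / (1.0 + np.exp(-(h @ self.w2 + self.b2)))
            g = (p - y) / len(y)                      # gradient w.r.t. the output logit
            dh = np.outer(g, self.w2) * (1.0 - h**2)  # backprop through tanh
            self.W1 -= lr * (X.T @ dh)
            self.b1 -= lr * dh.sum(axis=0)
            self.w2 -= lr * (h.T @ g)
            self.b2 -= lr * g.sum()
        return self


# Toy data: random vectors stand in for pretrained word embeddings (assumption).
rng = np.random.default_rng(42)
dim = 16
vocab_vecs = rng.normal(size=(50, dim))
query_vecs = vocab_vecs[[3, 7]]  # pretend the query consists of vocabulary terms 3 and 7

# Step 1: embedding-based candidate generation.
cand = cosine_candidates(query_vecs, vocab_vecs, k=10)

# Step 2: classifier input = candidate embedding concatenated with mean query embedding (assumption).
q_mean = query_vecs.mean(axis=0)
X = np.hstack([vocab_vecs[cand], np.tile(q_mean, (len(cand), 1))])

# Placeholder labels; real labels would need some measure of whether a term helped retrieval.
y = (np.arange(len(cand)) % 2).astype(float)

clf = TermUsefulnessMLP(dim=X.shape[1]).fit(X, y)
probs = clf.predict_proba(X)
selected = cand[probs > 0.5]  # only terms predicted useful are kept for expansion
```

The point of the two-stage design is that nearest-neighbour candidates are treated as proposals only; the classifier decides which of them are actually added to the query.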