IR · Jun 23, 2016

Selective Term Proximity Scoring Via BP-ANN

arXiv:1606.07188v1 · 6 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency issues in information retrieval for search engines, but it is incremental as it builds on existing term proximity techniques.

The paper tackles the problem of term proximity scoring slowing down query processing by proposing a model that selectively applies proximity-based ranking only when beneficial, based on query features. Experiments show the model improves rankings and reduces overhead.

When two terms occur together in a document, the probability of a close relationship between them and the document itself is greater if they are in nearby positions. However, ranking functions including term proximity (TP) require larger indexes than traditional document-level indexing, which slows down query processing. Previous studies also show that this technique is not effective for all types of queries. Here we propose a document ranking model which decides for which queries it would be beneficial to use a proximity-based ranking, based on a collection of features of the query. We use a machine learning approach in determining whether utilizing TP will be beneficial. Experiments show that the proposed model returns improved rankings while also reducing the overhead incurred as a result of using TP statistics.
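The selective idea in the abstract can be illustrated with a minimal sketch: a small back-propagation network (the "BP-ANN" of the title) looks at query features and decides whether the term proximity (TP) score is added to the document-level score. Everything specific here is an assumption made for illustration and not taken from the paper: the two query features, the synthetic training data, the network size, the additive score combination, and the names `TinyBPANN` and `rank`.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyBPANN:
    """One-hidden-layer network trained with plain backpropagation (illustrative)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is the bias weight.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = list(x) + [1.0]  # append constant bias input
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, hb)))
        return xb, hb, y

    def train(self, data, epochs=2000, lr=0.5):
        for _ in range(epochs):
            for x, target in data:
                xb, hb, y = self.forward(x)
                # Sigmoid output with cross-entropy loss: output delta is y - target.
                d_out = y - target
                # Update hidden weights first, using the not-yet-updated w2.
                for j, row in enumerate(self.w1):
                    d_h = d_out * self.w2[j] * hb[j] * (1.0 - hb[j])
                    for i in range(len(row)):
                        row[i] -= lr * d_h * xb[i]
                for j in range(len(self.w2)):
                    self.w2[j] -= lr * d_out * hb[j]

    def predict(self, x):
        return self.forward(x)[2]


def rank(query_features, base_scores, tp_scores, model, threshold=0.5):
    """Add TP scores to the base scores only if the model predicts a benefit."""
    use_tp = model.predict(query_features) >= threshold
    scores = {
        doc: s + (tp_scores.get(doc, 0.0) if use_tp else 0.0)
        for doc, s in base_scores.items()
    }
    return use_tp, sorted(scores, key=scores.get, reverse=True)


# Synthetic toy data, NOT from the paper. Hypothetical features per query:
# [normalized query length, normalized mean IDF]; label 1 means "TP helped".
TRAIN = [
    ([0.1, 0.8], 0), ([0.2, 0.3], 0), ([0.3, 0.6], 0),
    ([0.7, 0.4], 1), ([0.8, 0.9], 1), ([0.9, 0.2], 1),
]

model = TinyBPANN(n_in=2, n_hidden=3)
model.train(TRAIN)

base = {"d1": 1.0, "d2": 0.9}  # document-level scores (hypothetical)
tp = {"d2": 0.5}               # proximity bonus (hypothetical)
print(rank([0.95, 0.5], base, tp, model))
print(rank([0.05, 0.5], base, tp, model))
```

In a real system the saving would come from skipping the positional index lookup entirely when the model predicts no benefit; the sketch only models the decision, not the index access.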
