IR, Sep 6, 2020

Proximity full-text searches of frequently occurring words with a response time guarantee

arXiv:2009.03679v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the efficiency of proximity search in full-text search engines when queries consist of high-frequency terms. It represents an incremental improvement.

The paper tackles proximity full-text search for frequently occurring words. It proposes an indexing method that stores, for each word, information about the nearby words within a distance of MaxDistance. With these indexes, the average query execution time is 94.7 to 45.9 times lower than with standard inverted files, depending on the value of MaxDistance.

Full-text search engines are important tools for information retrieval. In a proximity full-text search, a document is relevant if it contains the query terms near each other, especially if the query terms are frequently occurring words. For each word in the text, we use additional indexes to store information about nearby words located at a distance of at most MaxDistance from that word, where MaxDistance is a parameter. We discuss a search algorithm for the case in which the query consists of frequently occurring words. In addition, we present the results of experiments with different values of MaxDistance to evaluate how the search speed depends on MaxDistance. These results show that, for queries containing frequently occurring words, the average query execution time with our indexes is 94.7 to 45.9 times lower (depending on the value of MaxDistance) than with standard inverted files.

This is a pre-print of a contribution published in Pinelas S., Kim A., Vlasov V. (eds) Mathematical Analysis With Applications. CONCORD-90 2018. Springer Proceedings in Mathematics & Statistics, vol 318, published by Springer, Cham. The final authenticated version is available online at: https://doi.org/10.1007/978-3-030-42176-2_37
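The idea of storing nearby words for each word can be illustrated with a small sketch. This is a simplified toy model, not the index layout or search algorithm from the paper: the function names (`build_pair_index`, `proximity_search_pairs`, and so on), the whitespace tokenizer, and the sample documents are all assumptions made for illustration. It contrasts a standard positional inverted file, where the posting lists of two frequent words must be scanned and compared, with an additional index keyed by word pairs that occur within `max_distance` of each other.

```python
from collections import defaultdict


def tokenize(text):
    # Naive whitespace tokenizer; an assumption for this sketch only.
    return text.lower().split()


def build_inverted_index(docs):
    # Standard positional inverted file: word -> [(doc_id, position), ...]
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for pos, word in enumerate(tokenize(text)):
            index[word].append((doc_id, pos))
    return index


def build_pair_index(docs, max_distance):
    # Additional index: (word, nearby_word) -> [(doc_id, pos, nearby_pos), ...]
    # for every pair of positions at most max_distance apart.
    pairs = defaultdict(list)
    for doc_id, text in enumerate(docs):
        words = tokenize(text)
        for i, w in enumerate(words):
            lo = max(0, i - max_distance)
            hi = min(len(words), i + max_distance + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs[(w, words[j])].append((doc_id, i, j))
    return pairs


def proximity_search_baseline(index, w, v, max_distance):
    # Baseline: compare the full posting lists of both words.
    hits = set()
    for doc_w, pos_w in index.get(w, []):
        for doc_v, pos_v in index.get(v, []):
            if doc_w == doc_v and pos_w != pos_v and abs(pos_w - pos_v) <= max_distance:
                hits.add(doc_w)
    return sorted(hits)


def proximity_search_pairs(pairs, w, v):
    # With the pair index, a two-word proximity query is a single lookup.
    return sorted({doc_id for doc_id, _, _ in pairs.get((w, v), [])})


docs = [
    "to be or not to be",
    "the time to act is now",
    "be the change you want to see",
]
max_distance = 2
inverted = build_inverted_index(docs)
pair_index = build_pair_index(docs, max_distance)

print(proximity_search_baseline(inverted, "to", "be", max_distance))
print(proximity_search_pairs(pair_index, "to", "be"))
```

In this sketch the pair lookup avoids scanning the long posting lists of frequent words, at the cost of an index whose size grows with `max_distance`, since each word position contributes up to `2 * max_distance` pair entries.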

