IR · AI · Jan 30, 2013

Query Expansion in Information Retrieval Systems using a Bayesian Network-Based Thesaurus

arXiv:1301.7364v1 · 37 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses the challenge of improving document retrieval in IR systems, but the contribution is incremental: it applies an existing technique (Bayesian networks) to a known problem (query expansion).

The paper tackles the problem of improving information retrieval effectiveness by developing a query expansion method using Bayesian networks to construct a collection-specific thesaurus, and reports results on three standard test collections.

Information Retrieval (IR) is concerned with the identification of documents in a collection that are relevant to a given information need, usually represented as a query containing terms or keywords, which are supposed to be a good description of what the user is looking for. IR systems may improve their effectiveness (i.e., increase the number of relevant documents retrieved) by using a process of query expansion, which automatically adds new terms to the original query posed by a user. In this paper we develop a method of query expansion based on Bayesian networks. Using a learning algorithm, we construct a Bayesian network that represents some of the relationships among the terms appearing in a given document collection; this network is then used as a thesaurus (specific to that collection). We also report the results obtained by our method on three standard test collections.
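To make the idea of using a Bayesian network as a thesaurus concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not say how the network is learned or how expansion terms are chosen, so the sketch assumes a tiny hand-specified network over binary "term is present" variables, made-up probabilities, exact inference by enumeration, and a hypothetical probability threshold. The names `NETWORK`, `posterior`, and `expand_query` are illustrative only.

```python
from itertools import product

# Hypothetical toy network (structure and numbers are invented for illustration).
# Each term maps to (parents, CPT), where the CPT gives P(term present | parent values).
NETWORK = {
    "retrieval": ((), {(): 0.30}),
    "bayesian": ((), {(): 0.20}),
    "query": (("retrieval",), {(True,): 0.70, (False,): 0.10}),
    "network": (("bayesian",), {(True,): 0.80, (False,): 0.15}),
    "thesaurus": (("retrieval",), {(True,): 0.40, (False,): 0.05}),
}
TERMS = list(NETWORK)


def joint(assignment):
    """Probability of one full true/false assignment to all terms."""
    p = 1.0
    for term, (parents, cpt) in NETWORK.items():
        p_present = cpt[tuple(assignment[q] for q in parents)]
        p *= p_present if assignment[term] else 1.0 - p_present
    return p


def posterior(term, evidence):
    """P(term present | evidence), by brute-force enumeration (fine for toy sizes)."""
    num = den = 0.0
    for values in product([True, False], repeat=len(TERMS)):
        assignment = dict(zip(TERMS, values))
        if any(assignment[e] != v for e, v in evidence.items()):
            continue  # inconsistent with the observed query terms
        p = joint(assignment)
        den += p
        if assignment[term]:
            num += p
    return num / den


def expand_query(query_terms, threshold=0.5):
    """Append every network term whose posterior, given the query, reaches the threshold."""
    evidence = {t: True for t in query_terms if t in NETWORK}
    added = [
        t for t in TERMS
        if t not in evidence and posterior(t, evidence) >= threshold
    ]
    return list(query_terms) + added


print(expand_query(["query"]))
```

With these invented numbers, observing "query" raises the probability of "retrieval" from 0.30 to 0.75, so "retrieval" is added, while unrelated terms stay below the threshold. Enumeration is exponential in the number of terms; a network learned from a real collection would need an efficient propagation algorithm instead.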

