Topic Level Disambiguation for Weak Queries
This study addresses the challenge of providing consistent results for users who issue weak queries in information retrieval systems, though its contribution is incremental because it builds on existing topic-based approaches.
The study tackled the problem of poor search results from incomplete or ambiguous queries by implementing a novel topic detection method that uses both the language model and structural knowledge of Wikipedia. It found that query disambiguation did not improve information retrieval as expected.
Despite some success, information retrieval (IR) systems today are neither intelligent nor reliable. They return poor search results when users express their information needs as incomplete or ambiguous queries (i.e., weak queries). One of the main challenges in modern IR research is therefore to provide consistent results across all queries by improving performance on weak queries. However, existing IR approaches such as query expansion are not very effective because they make little effort to analyze and exploit the meanings of the queries. Furthermore, word sense disambiguation approaches, which rely on textual context, are ineffective against weak queries, which are typically short and offer little context. Motivated by the demand for a robust IR system that can consistently provide highly accurate results, this study implemented a novel topic detection method that leverages both the language model and structural knowledge of Wikipedia, and systematically evaluated the effect of query disambiguation and topic-based retrieval approaches on TREC collections. The results confirm the effectiveness of the proposed topic detection and topic-based retrieval approaches, but they also demonstrate that query disambiguation does not improve IR as expected.