Knowledge-based Query Expansion in Real-Time Microblog Search
This work addresses the challenge of improving retrieval performance for short microblog texts such as tweets, an incremental advance in domain-specific information retrieval.
The paper tackles the vocabulary mismatch problem in real-time microblog search by proposing a knowledge-based query expansion method that uses Freebase and temporal evidence; the method significantly outperforms baseline methods on two TREC Twitter corpora.
Since the length of microblog texts, such as tweets, is strictly limited to 140 characters, traditional Information Retrieval techniques suffer severely from the vocabulary mismatch problem and cannot yield good performance in the microblogosphere. To address this challenge, we propose a new language modeling approach for microblog retrieval that infers various types of context information. In particular, we expand the query using knowledge terms derived from Freebase so that the expanded query better reflects users' search intent. In addition, to further satisfy users' real-time information needs, we incorporate temporal evidence into the expansion method, which boosts recent tweets in the retrieval results for a given topic. Experimental results on two official TREC Twitter corpora demonstrate the significant superiority of our approach over baseline methods.
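
To make the idea concrete, the following is a minimal sketch of how an expanded query model and a recency boost could be combined in a language modeling framework. It is an illustration under stated assumptions, not the method of the paper: the knowledge terms are passed in as a plain list (no Freebase lookup is performed), the linear interpolation weight `alpha`, the Dirichlet smoothing parameter `mu`, and the exponential recency prior with parameter `rate` are all hypothetical choices, and the function names `expand_query` and `score` are invented for this example.

```python
import math
from collections import Counter


def expand_query(query_terms, knowledge_terms, alpha=0.6):
    """Interpolate the original query model with a knowledge-term model.

    knowledge_terms stands in for terms derived from a knowledge base;
    how they are obtained is outside the scope of this sketch.
    """
    q = Counter(query_terms)
    k = Counter(knowledge_terms)
    q_total = sum(q.values())
    k_total = sum(k.values())
    model = {}
    for term in set(q) | set(k):
        p_q = q[term] / q_total if q_total else 0.0
        p_k = k[term] / k_total if k_total else 0.0
        # Assumed form: a simple linear mixture of the two models.
        model[term] = alpha * p_q + (1 - alpha) * p_k
    return model


def score(query_model, tweet_terms, collection_probs, age_days,
          mu=100.0, rate=0.1):
    """Weighted query log-likelihood plus a log recency prior.

    The tweet model uses Dirichlet smoothing; the recency prior is an
    assumed exponential decay over the tweet's age in days.
    """
    tf = Counter(tweet_terms)
    length = len(tweet_terms)
    log_lik = 0.0
    for term, weight in query_model.items():
        # Small floor for terms missing from the collection statistics.
        p_c = collection_probs.get(term, 1e-6)
        p_d = (tf[term] + mu * p_c) / (length + mu)
        log_lik += weight * math.log(p_d)
    log_prior = math.log(rate) - rate * age_days
    return log_lik + log_prior


if __name__ == "__main__":
    model = expand_query(["obama", "speech"], ["barack", "president"])
    stats = {"obama": 0.01, "speech": 0.01, "barack": 0.005, "president": 0.01}
    tweet = ["barack", "speech", "tonight"]
    print(score(model, tweet, stats, age_days=1))
    print(score(model, tweet, stats, age_days=10))
```

In this sketch, a tweet that contains only an expansion term (for example "barack" rather than "obama") still receives credit through the expanded model, which is the vocabulary mismatch effect the abstract describes, and of two tweets with identical text the more recent one scores higher because of the recency prior.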