From "Identical" to "Similar": Fusing Retrieved Lists Based on Inter-Document Similarities
This work addresses the fusion of retrieved document lists in search systems. The contribution is incremental: it extends existing fusion methods, which rely on retrieval scores and ranks, with similarity-based propagation of relevance status.
The paper tackles the problem of fusing multiple document lists retrieved for the same query. In addition to retrieval scores and ranks, it uses inter-document similarities, so that similar documents from different lists can support each other's relevance. The results show that the most effective graph-based methods outperform existing fusion methods that use only retrieval scores or ranks when fusing TREC runs.
Methods for fusing document lists that were retrieved in response to a query often utilize the retrieval scores and/or ranks of documents in the lists. We present a novel fusion approach that additionally uses information induced from inter-document similarities. Specifically, our methods let similar documents from different lists provide relevance-status support to each other. We use a graph-based method to model relevance-status propagation between documents. The propagation is governed by inter-document similarities and by the retrieval scores of documents in the lists. Empirical evaluation demonstrates the effectiveness of our methods in fusing TREC runs. The performance of our most effective methods surpasses that of effective fusion methods that utilize only retrieval scores or ranks.
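The propagation idea can be illustrated with a minimal sketch. This is not the authors' formulation: the abstract does not specify the similarity measure, the edge weights, or the propagation scheme, so the code below assumes cosine similarity over sparse term-weight vectors, a CombSUM-like prior built from sum-normalized retrieval scores, and a PageRank-style random walk with a damping factor. The names `cosine`, `fuse_by_propagation`, `damping`, and `iters` are hypothetical, and retrieval scores are assumed to be non-negative with at least one positive score.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def fuse_by_propagation(lists, vectors, damping=0.85, iters=50):
    """Fuse retrieved lists with a PageRank-style walk (illustrative only).

    lists   -- one dict per retrieved list, mapping doc id -> retrieval score
    vectors -- dict mapping doc id -> sparse term-weight dict
    Returns (doc ids sorted by fused score, dict of fused scores).
    """
    # Assumed prior: sum over lists of sum-normalized scores (CombSUM-like),
    # rescaled so the priors form a probability distribution.
    prior = {}
    for scores in lists:
        total = sum(scores.values())
        for doc, s in scores.items():
            prior[doc] = prior.get(doc, 0.0) + (s / total if total else 0.0)
    z = sum(prior.values())
    prior = {d: p / z for d, p in prior.items()}
    docs = sorted(prior)

    # Assumed edge weight u -> v: similarity(u, v) scaled by v's prior,
    # so propagation depends on both similarities and retrieval scores.
    trans = {}
    for u in docs:
        row = {v: cosine(vectors[u], vectors[v]) * prior[v]
               for v in docs if v != u}
        norm = sum(row.values())
        trans[u] = {v: w / norm for v, w in row.items()} if norm else {}

    rank = dict(prior)
    for _ in range(iters):
        nxt = {d: (1 - damping) * prior[d] for d in docs}
        for u in docs:
            if trans[u]:
                for v, p in trans[u].items():
                    nxt[v] += damping * rank[u] * p
            else:
                # A document similar to nothing spreads its mass by prior.
                for v in docs:
                    nxt[v] += damping * rank[u] * prior[v]
        rank = nxt

    return sorted(docs, key=lambda d: rank[d], reverse=True), rank
```

In this sketch, two documents with identical content that appear in different lists pass score to each other on every iteration, so they can overtake documents that received higher retrieval scores but resemble nothing else in the pool. A score-only method such as CombSUM would instead rank purely by the prior.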