IR, AI, CL | Jun 5, 2025

Exp4Fuse: A Rank Fusion Framework for Enhanced Sparse Retrieval using Large Language Model-based Query Expansion

arXiv:2506.04760v1 | 6 citations | h-index: 1 | ACL

Originality: Incremental advance

AI Analysis

This paper addresses the computational cost and complexity of LLM-based query expansion, a practical concern for information retrieval practitioners. The contribution appears incremental because it builds on existing query expansion and rank fusion techniques.

The paper tackles the problem of improving sparse retrieval performance using LLM-based query expansion without costly dense retrieval techniques, introducing Exp4Fuse, a fusion ranking framework that combines original and LLM-augmented query results. Experimental results show it surpasses existing LLM-based query expansion methods and achieves state-of-the-art results on several benchmarks when combined with advanced sparse retrievers.

Large Language Models (LLMs) have shown potential in generating hypothetical documents for query expansion, thereby enhancing information retrieval performance. However, the efficacy of this method is highly dependent on the quality of the generated documents, which often requires complex prompt strategies and the integration of advanced dense retrieval techniques. This can be both costly and computationally intensive. To mitigate these limitations, we explore the use of zero-shot LLM-based query expansion to improve sparse retrieval, particularly for learned sparse retrievers. We introduce a novel fusion ranking framework, Exp4Fuse, which enhances the performance of sparse retrievers through an indirect application of zero-shot LLM-based query expansion. Exp4Fuse operates by simultaneously considering two retrieval routes: one based on the original query and the other on the LLM-augmented query. It then generates two ranked lists using a sparse retriever and fuses them using a modified reciprocal rank fusion method. We conduct extensive evaluations of Exp4Fuse against leading LLM-based query expansion methods and advanced retrieval techniques on three MS MARCO-related datasets and seven low-resource datasets. Experimental results reveal that Exp4Fuse not only surpasses existing LLM-based query expansion methods in enhancing sparse retrievers but also, when combined with advanced sparse retrievers, achieves state-of-the-art (SOTA) results on several benchmarks. This highlights the effectiveness of Exp4Fuse in improving query expansion for sparse retrieval.
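
The two-route fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the paper's modification to reciprocal rank fusion, so the code below uses the standard, unmodified formula (each document scores the sum of 1 / (k + rank) over the lists it appears in) and should not be read as the authors' exact method. The function name, the document IDs, and the constant k = 60 (a commonly used default for RRF) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked lists of document IDs with standard (unmodified) RRF.

    ranked_lists: iterable of lists, each ordered best-first.
    k: smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Ranks are 1-based: the top document contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical output of the same sparse retriever on the two routes.
original_route = ["d1", "d2", "d3"]   # ranked list for the original query
expanded_route = ["d3", "d1", "d4"]   # ranked list for the LLM-augmented query

fused = reciprocal_rank_fusion([original_route, expanded_route])
print(fused)
```

Documents retrieved by both routes accumulate score from each list, so they tend to rise above documents found by only one route. This is the general property that makes rank fusion a way to benefit from the expanded query without replacing the original one.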

