A Systematic Study of Pseudo-Relevance Feedback with LLMs
This work provides insights for improving retrieval systems in low-resource settings, though it is incremental in nature.
The paper systematically studies pseudo-relevance feedback (PRF) with large language models (LLMs) by separating the roles of the feedback source and the feedback model. It finds that the choice of feedback model can be critical, that feedback from LLM-generated text alone is the most cost-effective option, and that corpus-derived feedback helps most when the candidate documents come from a strong first-stage retriever.
Pseudo-relevance feedback (PRF) methods built on large language models (LLMs) can be organized along two key design dimensions: the feedback source, which is where the feedback text is derived from, and the feedback model, which is how the given feedback text is used to refine the query representation. However, the independent role that each dimension plays is unclear, as the two are often entangled in empirical evaluations. In this paper, we address this gap by systematically studying, through controlled experimentation, how the choice of feedback source and feedback model impacts PRF effectiveness. Across 13 low-resource BEIR tasks with five LLM PRF methods, our results show: (1) the choice of feedback model can play a critical role in PRF effectiveness; (2) feedback derived solely from LLM-generated text provides the most cost-effective solution; and (3) feedback derived from the corpus is most beneficial when utilizing candidate documents from a strong first-stage retriever. Together, our findings provide a better understanding of which elements in the PRF design space are most important.
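To make the two design dimensions concrete, the following is a minimal Python sketch, not the paper's implementation. It treats the feedback source and the feedback model as independent, swappable functions. Every name in it (`corpus_source`, `generated_source`, `concat_model`, `rocchio_model`, `CORPUS`) is a hypothetical chosen for illustration; the term-overlap retriever is a toy stand-in for a real first-stage retriever, the `generate` callable is a placeholder for an LLM call, and the Rocchio-style update is a classic feedback model that is assumed here only as an example and may not be among the five methods the paper evaluates.

```python
from collections import Counter
from typing import Callable, Dict, List

# Toy corpus, assumed purely for illustration.
CORPUS = [
    "solar panels convert sunlight",
    "wind turbines generate power",
    "solar energy storage batteries",
]


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


# --- Feedback sources: where the feedback text comes from ---

def corpus_source(query: str, corpus: List[str] = CORPUS, k: int = 2) -> List[str]:
    """Top-k documents from a toy first-stage retriever (term-overlap score)."""
    q_terms = set(tokenize(query))
    ranked = sorted(corpus, key=lambda d: len(q_terms & set(tokenize(d))), reverse=True)
    return ranked[:k]


def generated_source(query: str, generate: Callable[[str], str]) -> List[str]:
    """Feedback text from a generator; `generate` stands in for an LLM call."""
    return [generate(query)]


# --- Feedback models: how the feedback text refines the query ---

def concat_model(query: str, feedback: List[str]) -> Dict[str, float]:
    """Append the feedback text to the query and weight terms by raw count."""
    counts = Counter(tokenize(query + " " + " ".join(feedback)))
    return {term: float(c) for term, c in counts.items()}


def rocchio_model(query: str, feedback: List[str],
                  alpha: float = 1.0, beta: float = 0.5) -> Dict[str, float]:
    """Rocchio-style update: alpha * query terms + beta * mean feedback terms."""
    weights = {t: alpha * c for t, c in Counter(tokenize(query)).items()}
    for doc in feedback:
        for term, c in Counter(tokenize(doc)).items():
            weights[term] = weights.get(term, 0.0) + beta * c / len(feedback)
    return weights


if __name__ == "__main__":
    query = "solar energy"
    sources = {
        "corpus": corpus_source(query),
        # Canned string in place of a real LLM response.
        "generated": generated_source(query, lambda q: "photovoltaic cells produce electricity"),
    }
    models = {"concat": concat_model, "rocchio": rocchio_model}
    # Crossing every source with every model is the kind of controlled
    # comparison that keeps the two dimensions from being entangled.
    for s_name, feedback in sources.items():
        for m_name, model in models.items():
            print(s_name, m_name, model(query, feedback))
```

Because each source returns a list of feedback texts and each model accepts one, any source can be paired with any model. Holding one fixed while varying the other is what allows the effect of each dimension to be measured separately.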