Learning to Rank for Multiple Retrieval-Augmented Models through Iterative Utility Maximization
This work addresses the challenge of personalizing retrieval for diverse RAG agents in AI systems and represents an incremental improvement over existing methods.
This paper tackles the problem of designing a unified search engine that serves multiple retrieval-augmented generation (RAG) agents with distinct tasks and strategies. It introduces an iterative approach that optimizes retrieval from agent feedback to maximize each agent's utility, yielding significant average performance improvements over baselines across 18 RAG models on the KILT benchmark.
This paper investigates the design of a unified search engine to serve multiple retrieval-augmented generation (RAG) agents, each with a distinct task, backbone large language model (LLM), and RAG strategy. We introduce an iterative approach in which the search engine generates retrieval results for the RAG agents and gathers feedback on the quality of the retrieved documents during an offline phase. This feedback is then used to iteratively optimize the search engine with an expectation-maximization algorithm, with the goal of maximizing each agent's utility function. Additionally, we adapt this approach to an online setting, allowing the search engine to refine its behavior based on real-time feedback from individual agents and so serve each of them better. Experiments on datasets from the Knowledge-Intensive Language Tasks (KILT) benchmark demonstrate that our approach, on average, significantly outperforms baselines across 18 RAG models. We demonstrate that our method effectively "personalizes" retrieval for each RAG agent based on the collected feedback. Finally, we provide a comprehensive ablation study that explores various aspects of our method.
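
The retrieve, collect feedback, update cycle described above can be illustrated with a toy sketch. This is not the paper's expectation-maximization procedure: it swaps in a simple additive weight update on a linear scorer, and every name in it (`retrieve`, `optimize_for_agent`, `DOCS`, the feedback functions, the 0.5 utility threshold) is a hypothetical chosen for illustration. It only shows how per-agent feedback over a shared corpus can lead to a different ranking for each agent.

```python
from typing import Callable, Dict, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(weights: Vector, docs: Dict[str, Vector], k: int) -> List[str]:
    """Return the ids of the k documents with the highest linear score."""
    return sorted(docs, key=lambda d: dot(weights, docs[d]), reverse=True)[:k]


def optimize_for_agent(
    docs: Dict[str, Vector],
    feedback: Callable[[str], float],
    dim: int,
    k: int = 2,
    rounds: int = 5,
    lr: float = 0.5,
) -> Vector:
    """Iteratively adapt scorer weights using one agent's feedback.

    Assumption: feedback(doc_id) returns a utility in [0, 1]. The update
    below is a simple stand-in, not the EM algorithm from the paper.
    """
    weights = [0.0] * dim
    for _ in range(rounds):
        # 1. The search engine serves its current top-k results.
        ranked = retrieve(weights, docs, k)
        for doc_id in ranked:
            # 2. The agent reports how useful each retrieved document was.
            signal = feedback(doc_id) - 0.5
            # 3. Move the weights toward useful documents, away from others.
            weights = [w + lr * signal * x for w, x in zip(weights, docs[doc_id])]
    return weights


# Toy shared corpus: document id -> 2-dimensional feature vector.
DOCS: Dict[str, Vector] = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}

# Two hypothetical agents that find different documents useful.
weights_agent_1 = optimize_for_agent(DOCS, lambda d: 1.0 if d == "a" else 0.0, dim=2)
weights_agent_2 = optimize_for_agent(DOCS, lambda d: 1.0 if d == "b" else 0.0, dim=2)

top_for_agent_1 = retrieve(weights_agent_1, DOCS, k=1)
top_for_agent_2 = retrieve(weights_agent_2, DOCS, k=1)
```

In this sketch each agent gets its own weight vector, so the same corpus yields a different top result per agent. How the actual method shares or separates parameters across agents is not stated in the abstract, so this choice is an assumption made only for the example.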