Towards a Search Engine for Machines: Unified Ranking for Multiple Retrieval-Augmented Large Language Models
This work addresses the challenge of building search engines whose users are machines, specifically retrieval-augmented AI systems. The contribution appears incremental: it builds on existing RAG frameworks and adds a unified retrieval layer on top of them.
The paper tackles the problem of developing a unified retrieval engine for multiple retrieval-augmented generation (RAG) systems. It introduces uRAG, which standardizes how downstream systems take part in training the retriever, and serves RAG systems for tasks such as question answering and fact verification. The resulting large-scale experimentation ecosystem, with 18 RAG systems that engage in training and 18 previously unseen ones, is used to address fundamental research questions.
This paper introduces uRAG, a framework with a unified retrieval engine that serves multiple downstream retrieval-augmented generation (RAG) systems. Each RAG system consumes the retrieval results for a unique purpose, such as open-domain question answering, fact verification, entity linking, or relation extraction. We introduce a generic training guideline that standardizes the communication between the search engine and the downstream RAG systems that engage in optimizing the retrieval model. This lays the groundwork for a large-scale experimentation ecosystem consisting of 18 RAG systems that engage in training and 18 unknown RAG systems that use uRAG as new users of the search engine. Using this experimentation ecosystem, we answer a number of fundamental research questions that improve our understanding of the promises and challenges in developing search engines for machines.
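
To make the idea of one retrieval engine serving several downstream RAG systems more concrete, the following is a minimal Python sketch. It is not the uRAG implementation: the class `UnifiedRetriever`, the `Feedback` record, the term-overlap scoring, the two toy downstream systems, and the scalar per-document utility signal are all hypothetical choices made for illustration. The abstract only states that communication between the search engine and the downstream systems is standardized; the exact form of that communication is assumed here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: a downstream system maps (query, retrieved texts)
# to one utility score per retrieved text.
DownstreamSystem = Callable[[str, List[str]], List[float]]


@dataclass
class Feedback:
    """One standardized feedback record (assumed format, not from the paper)."""
    system_id: str
    query: str
    doc_id: str
    utility: float


class UnifiedRetriever:
    """A single retriever shared by every downstream system."""

    def __init__(self, corpus: Dict[str, str]):
        self.corpus = corpus
        self.feedback_log: List[Feedback] = []

    def search(self, query: str, k: int = 2) -> List[Tuple[str, float]]:
        # Toy scoring: count query terms that also occur in the document.
        q_terms = set(query.lower().split())
        scored = [
            (doc_id, float(len(q_terms & set(text.lower().split()))))
            for doc_id, text in self.corpus.items()
        ]
        # Highest score first; ties broken by document id for determinism.
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        return scored[:k]

    def record_feedback(self, feedback: Feedback) -> None:
        self.feedback_log.append(feedback)

    def mean_utility(self) -> Dict[str, float]:
        # Average the utility reported for each document across all systems.
        totals: Dict[str, List[float]] = {}
        for fb in self.feedback_log:
            totals.setdefault(fb.doc_id, []).append(fb.utility)
        return {doc_id: sum(v) / len(v) for doc_id, v in totals.items()}


def qa_system(query: str, texts: List[str]) -> List[float]:
    # Toy stand-in for a question answering system.
    return [1.0 if "france" in t.lower() else 0.0 for t in texts]


def fact_checker(query: str, texts: List[str]) -> List[float]:
    # Toy stand-in for a fact verification system.
    return [1.0 if "capital" in t.lower() else 0.0 for t in texts]


def serve(retriever: UnifiedRetriever,
          systems: Dict[str, DownstreamSystem],
          query: str, k: int = 2) -> None:
    # Every system queries the same retriever and reports feedback
    # in the same record format.
    for system_id, system in systems.items():
        results = retriever.search(query, k)
        texts = [retriever.corpus[doc_id] for doc_id, _ in results]
        for (doc_id, _), utility in zip(results, system(query, texts)):
            retriever.record_feedback(Feedback(system_id, query, doc_id, utility))


corpus = {
    "d1": "Paris is the capital of France",
    "d2": "Berlin is the capital of Germany",
    "d3": "The Seine flows through Paris",
}
retriever = UnifiedRetriever(corpus)
serve(retriever, {"qa": qa_system, "fact_check": fact_checker}, "capital of France")
print(retriever.mean_utility())
```

In this sketch the aggregated utilities could serve as a training signal for the shared retriever; how uRAG actually aggregates and uses feedback from its downstream systems is described in the paper itself, not here.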