OpenRAG: Optimizing RAG End-to-End via In-Context Retrieval Learning
This work addresses suboptimal retrieval in RAG systems, offering AI and NLP practitioners a cost-effective solution that incrementally improves on existing methods.
The paper shows that relevance learned for conventional information retrieval can be inconsistent with what is actually useful in retrieval-augmented generation (RAG). It introduces OpenRAG, a framework that optimizes the retriever end-to-end to capture in-context relevance, yielding a 4.0% improvement over the original retriever and outperforming state-of-the-art retrievers by 2.1%.
In this paper, we analyze and empirically show that the relevance learned for conventional information retrieval (IR) scenarios may be inconsistent in retrieval-augmented generation (RAG) scenarios. To bridge this gap, we introduce OpenRAG, a RAG framework that is optimized end-to-end by tuning the retriever to capture in-context relevance, enabling adaptation to diverse and evolving needs. Extensive experiments across a wide range of tasks demonstrate that OpenRAG, by tuning a retriever end-to-end, yields a consistent improvement of 4.0% over the original retriever and outperforms existing state-of-the-art retrievers by 2.1%. Additionally, our results indicate that for some tasks, an end-to-end tuned 0.2B retriever can achieve improvements that surpass those of RAG-oriented or instruction-tuned 8B large language models (LLMs), highlighting the cost-effectiveness of our approach in enhancing RAG systems.
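To make the idea of in-context relevance more concrete, the following is a minimal sketch, not the paper's actual training procedure. It assumes a document counts as a positive when the reader's answer with that document in context matches the gold answer, and it uses a generic softmax contrastive loss over retriever scores. The function names `in_context_labels` and `contrastive_loss`, the exact-match criterion, and the toy data are all hypothetical and chosen only for illustration.

```python
import math


def in_context_labels(doc_ids, answer_with_doc, gold_answer):
    """Mark a document as positive (1) if the reader's answer, produced with
    that document in context, exactly matches the gold answer (assumed rule)."""
    return [1 if answer_with_doc[d] == gold_answer else 0 for d in doc_ids]


def contrastive_loss(scores, labels, temperature=1.0):
    """Negative log of the softmax probability mass that the retriever scores
    assign to the positive documents. Returns 0.0 if there is no positive."""
    if not any(labels):
        return 0.0  # nothing to learn from this query in this sketch
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    positive_mass = sum(e for e, y in zip(exps, labels) if y)
    return -math.log(positive_mass / total)


# Toy usage: "d2" looks less similar to the query (lower score) but is the
# document that lets the reader answer correctly, so it is the positive.
doc_ids = ["d1", "d2", "d3"]
reader_answers = {"d1": "Lyon", "d2": "Paris", "d3": "Nice"}
labels = in_context_labels(doc_ids, reader_answers, gold_answer="Paris")
retriever_scores = [2.0, 0.5, 0.1]
loss = contrastive_loss(retriever_scores, labels)
```

In this sketch, minimizing the loss with respect to the retriever's parameters would push the score of `d2` above the others, which is one way a retriever could be steered toward documents that help generation rather than documents that merely look similar to the query.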