IR · CL · Feb 26, 2024

Retrieval Augmented Generation Systems: Automatic Dataset Creation, Evaluation and Boolean Agent Setup

arXiv:2403.00820v1 · 6 citations · h-index: 4 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for standardized evaluation in RAG systems for researchers and practitioners, though it is incremental as it builds on existing RAG concepts.

The paper tackles the lack of quantitative comparisons between Retrieval Augmented Generation (RAG) systems by developing a rigorous dataset creation and evaluation workflow. It then uses a dataset built this way to develop and evaluate a boolean agent RAG setup, in which the LLM decides whether to query the vector database and so saves tokens on questions it can answer from internal knowledge. The abstract reports no quantitative performance figures.

Retrieval Augmented Generation (RAG) systems have become very popular for augmenting Large Language Model (LLM) outputs with domain-specific and time-sensitive data. Recently, a shift has begun from simple RAG setups, which query a vector database for additional information with every user input, to more sophisticated forms of RAG. However, the different concrete approaches currently compete on mostly anecdotal evidence. In this paper we present a rigorous dataset creation and evaluation workflow to quantitatively compare different RAG strategies. We use a dataset created this way for the development and evaluation of a boolean agent RAG setup: a system in which an LLM can decide whether to query a vector database or not, thus saving tokens on questions that can be answered with internal knowledge. We publish our code and generated dataset online.
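
The boolean agent idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`boolean_agent_answer`, `toy_decide`, `toy_retrieve`, `toy_generate`, `DOCS`) is hypothetical, the keyword check stands in for the LLM's yes/no decision, the dictionary lookup stands in for a vector database query, and the whitespace word count is only a crude proxy for tokens.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Answer:
    text: str
    used_retrieval: bool
    prompt_tokens: int  # crude whitespace count, for illustration only


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


def boolean_agent_answer(
    question: str,
    decide: Callable[[str], bool],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> Answer:
    # Step 1: a boolean decision on whether external context is needed.
    needs_context = decide(question)
    # Step 2: query the store only when the decision is True.
    context = retrieve(question) if needs_context else []
    # Step 3: build the prompt; it stays short when retrieval is skipped.
    prompt = "\n".join(context + [question])
    return Answer(generate(prompt), needs_context, count_tokens(prompt))


# Toy stand-ins (hypothetical) for the LLM and the vector database.
DOCS = {"policy": "The 2024 travel policy caps hotel costs at 150 EUR per night."}


def toy_decide(question: str) -> bool:
    # Stand-in for asking an LLM "do you need to look this up?"
    return "policy" in question.lower()


def toy_retrieve(question: str) -> List[str]:
    # Stand-in for a vector similarity search.
    return [text for key, text in DOCS.items() if key in question.lower()]


def toy_generate(prompt: str) -> str:
    # Stand-in for the answering LLM call.
    return f"answer based on {count_tokens(prompt)} prompt tokens"


general = boolean_agent_answer(
    "What is the capital of France?", toy_decide, toy_retrieve, toy_generate
)
specific = boolean_agent_answer(
    "What does the travel policy say about hotels?",
    toy_decide, toy_retrieve, toy_generate,
)
print(general.used_retrieval, general.prompt_tokens)
print(specific.used_retrieval, specific.prompt_tokens)
```

In this toy run the general-knowledge question skips retrieval and yields a shorter prompt, which is the token saving the abstract describes. In a real setup the decision step is itself an LLM call and has its own token cost, so whether there is a net saving would have to be measured on an evaluation dataset.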

Code Implementations: 1 repo
