CL | Oct 28, 2024

AutoRAG: Automated Framework for optimization of Retrieval Augmented Generation Pipeline

arXiv:2410.20878v1 | 38 citations | h-index: 2 | Has Code
Originality: Incremental advance
AI Analysis

AutoRAG addresses the challenge researchers and practitioners face in efficiently selecting RAG modules for different datasets, though it appears incremental because it automates existing optimization processes.

The paper tackles the varying performance of Retrieval-Augmented Generation (RAG) modules across datasets by proposing AutoRAG, an automated framework that identifies and optimizes suitable RAG module combinations for a specific dataset. The experimental results and data are publicly available.

Using LLMs (Large Language Models) in conjunction with external documents has made RAG (Retrieval-Augmented Generation) an essential technology. Numerous techniques and modules for RAG are being researched, but their performance can vary across different datasets. Finding RAG modules that perform well on specific datasets is challenging. In this paper, we propose the AutoRAG framework, which automatically identifies suitable RAG modules for a given dataset. AutoRAG explores and approximates the optimal combination of RAG modules for the dataset. Additionally, we share the results of optimizing a dataset using AutoRAG. All experimental results and data are publicly available and can be accessed through our GitHub repository: https://github.com/Marker-Inc-Korea/AutoRAG_ARAGOG_Paper.
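To make "explores and approximates the optimal combination of RAG modules" more concrete, here is a minimal sketch of one way such a search could be organized: a stage-by-stage greedy selection that keeps the best-scoring candidate at each pipeline stage. This is an illustrative assumption, not code from the AutoRAG repository. The function `greedy_search`, the toy modules, the `GOLD` set, and the `precision` metric are all hypothetical names invented for this example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type: a module takes the running pipeline state and returns an updated state.
Module = Callable[[dict], dict]


def greedy_search(
    nodes: List[Tuple[str, Dict[str, Module]]],
    score: Callable[[dict], float],
    initial_state: dict,
) -> Tuple[Dict[str, str], dict]:
    """For each stage in order, keep the candidate module with the highest score."""
    state = dict(initial_state)
    chosen: Dict[str, str] = {}
    for node_name, candidates in nodes:
        best_name, best_state, best_score = None, state, float("-inf")
        for module_name, module in candidates.items():
            # Run each candidate on a copy so candidates do not affect each other.
            candidate_state = module(dict(state))
            candidate_score = score(candidate_state)
            if candidate_score > best_score:
                best_name, best_state, best_score = module_name, candidate_state, candidate_score
        if best_name is not None:  # skip stages that list no candidates
            chosen[node_name] = best_name
            state = best_state
    return chosen, state


# Toy setup (all values invented): documents d1 and d3 are the relevant ones.
GOLD = {"d1", "d3"}


def keyword_retriever(state: dict) -> dict:
    state["docs"] = ["d1", "d2", "d4"]
    return state


def dense_retriever(state: dict) -> dict:
    state["docs"] = ["d3", "d1", "d5"]
    return state


def pass_through(state: dict) -> dict:
    return state


def keep_top2(state: dict) -> dict:
    state["docs"] = state["docs"][:2]
    return state


def precision(state: dict) -> float:
    """Fraction of retrieved documents that are in the gold set."""
    docs = state.get("docs", [])
    return sum(d in GOLD for d in docs) / len(docs) if docs else 0.0


nodes = [
    ("retrieval", {"keyword": keyword_retriever, "dense": dense_retriever}),
    ("rerank", {"pass_through": pass_through, "keep_top2": keep_top2}),
]
chosen, final_state = greedy_search(nodes, precision, {"query": "example"})
print(chosen)       # {'retrieval': 'dense', 'rerank': 'keep_top2'}
print(final_state)  # {'query': 'example', 'docs': ['d3', 'd1']}
```

A greedy pass like this evaluates the sum of the candidate counts per stage rather than their product, which is why it only approximates the best overall combination: a module that scores lower at an early stage might still have led to a better final pipeline.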

Code Implementations: 2 repos
