IR, CL | Oct 3, 2025

Less LLM, More Documents: Searching for Improved RAG

arXiv:2510.02657v2 | 1 citation | h-index: 3
Originality: Incremental advance
AI Analysis

The paper addresses cost and deployability issues for RAG users by describing a trade-off between corpus size and model size; the advance is incremental.

The paper tackles the high cost and limited deployability of Retrieval-Augmented Generation (RAG) by scaling the retriever's corpus rather than the large language model (LLM). It finds that a larger corpus can often substitute for a larger model: small- and mid-sized generators paired with larger corpora often rival much larger models paired with smaller corpora.

Retrieval-Augmented Generation (RAG) couples document retrieval with large language models (LLMs). While scaling generators improves accuracy, it also raises cost and limits deployability. We explore an orthogonal axis: enlarging the retriever's corpus to reduce reliance on large LLMs. Experimental results show that corpus scaling consistently strengthens RAG and can often serve as a substitute for increasing model size, though with diminishing returns at larger scales. Small- and mid-sized generators paired with larger corpora often rival much larger models with smaller corpora; mid-sized models tend to gain the most, while tiny and large models benefit less. Our analysis shows that improvements arise primarily from increased coverage of answer-bearing passages, while utilization efficiency remains largely unchanged. These findings establish a principled corpus-generator trade-off: investing in larger corpora offers an effective path to stronger RAG, often comparable to enlarging the LLM itself.
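The distinction the abstract draws between coverage and utilization can be illustrated with a small sketch. This is not the paper's evaluation code. It assumes, for illustration only, that a question counts as "covered" when any retrieved passage contains the gold answer string, and that "utilization" is the share of covered questions the generator answers correctly. The names `Example`, `is_covered`, `is_correct`, and `coverage_and_utilization` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Example:
    retrieved: list[str]  # passages returned by the retriever for one question
    gold_answer: str      # reference answer string
    prediction: str       # generator output


def is_covered(ex: Example) -> bool:
    # Assumption: substring match stands in for "answer-bearing passage".
    gold = ex.gold_answer.lower()
    return any(gold in passage.lower() for passage in ex.retrieved)


def is_correct(ex: Example) -> bool:
    # Assumption: lenient containment match rather than strict exact match.
    return ex.gold_answer.lower() in ex.prediction.lower()


def coverage_and_utilization(examples: list[Example]) -> tuple[float, float]:
    """Return (coverage, utilization) under the assumptions stated above."""
    if not examples:
        return 0.0, 0.0
    covered = [ex for ex in examples if is_covered(ex)]
    coverage = len(covered) / len(examples)
    # Utilization is measured only on questions where the answer was retrievable.
    utilization = (
        sum(is_correct(ex) for ex in covered) / len(covered) if covered else 0.0
    )
    return coverage, utilization
```

Under these definitions, a larger corpus that raises coverage while utilization stays flat would match the pattern the abstract describes: the gains come from retrieving more answer-bearing passages, not from the generator using them better.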
