IR, CL | Oct 3, 2025

Less LLM, More Documents: Searching for Improved RAG

Jingjie Ning, Yibo Kong, Yunfan Long, Jamie Callan

arXiv:2510.02657v21 citations | h-index: 3

Originality: Incremental advance

AI Analysis

The work addresses cost and deployability issues for RAG users by characterizing a trade-off between corpus size and model size; the advance is incremental.

The paper tackles the high cost and limited deployability of Retrieval-Augmented Generation (RAG) by scaling the retriever's corpus instead of the large language model (LLM). It finds that a larger corpus can often substitute for a larger model: small- and mid-sized generators paired with larger corpora often rival much larger models paired with smaller corpora.

Retrieval-Augmented Generation (RAG) couples document retrieval with large language models (LLMs). While scaling generators improves accuracy, it also raises cost and limits deployability. We explore an orthogonal axis: enlarging the retriever's corpus to reduce reliance on large LLMs. Experimental results show that corpus scaling consistently strengthens RAG and can often serve as a substitute for increasing model size, though with diminishing returns at larger scales. Small- and mid-sized generators paired with larger corpora often rival much larger models with smaller corpora; mid-sized models tend to gain the most, while tiny and large models benefit less. Our analysis shows that improvements arise primarily from increased coverage of answer-bearing passages, while utilization efficiency remains largely unchanged. These findings establish a principled corpus-generator trade-off: investing in larger corpora offers an effective path to stronger RAG, often comparable to enlarging the LLM itself.
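The abstract separates two effects: coverage (whether answer-bearing passages are retrieved at all) and utilization (how often the generator answers correctly once they are). The sketch below is one possible way to compute the two quantities. The definitions are assumptions made for illustration (substring matching against gold answers, and accuracy conditioned on coverage) and may differ from the paper's exact metrics. The `Example` type, the function names, and the toy data are hypothetical and are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One evaluated question (hypothetical record layout)."""
    gold_answers: list[str]  # acceptable answer strings
    passages: list[str]      # top-k retrieved passages
    correct: bool            # whether the generator's answer was judged correct


def is_covered(ex: Example) -> bool:
    """Assumed proxy: some retrieved passage contains some gold answer string."""
    return any(
        ans.lower() in passage.lower()
        for ans in ex.gold_answers
        for passage in ex.passages
    )


def coverage(examples: list[Example]) -> float:
    """Fraction of questions whose retrieved passages contain a gold answer."""
    if not examples:
        return 0.0
    return sum(is_covered(ex) for ex in examples) / len(examples)


def utilization(examples: list[Example]) -> float:
    """Fraction of covered questions that the generator answers correctly."""
    covered = [ex for ex in examples if is_covered(ex)]
    if not covered:
        return 0.0
    return sum(ex.correct for ex in covered) / len(covered)


# Toy data only: the same four questions retrieved from a smaller and a larger corpus.
small = [
    Example(["Paris"], ["Paris is the capital of France."], True),
    Example(["1969"], ["The Moon orbits Earth."], False),
    Example(["Everest"], ["K2 is in the Karakoram."], False),
    Example(["oxygen"], ["Oxygen is element 8."], False),
]
large = [
    Example(["Paris"], ["Paris is the capital of France."], True),
    Example(["1969"], ["Apollo 11 landed in 1969."], True),
    Example(["Everest"], ["Mount Everest is the tallest peak."], False),
    Example(["oxygen"], ["Oxygen is element 8."], False),
]

print(coverage(small), utilization(small))  # 0.5 0.5
print(coverage(large), utilization(large))  # 1.0 0.5
```

In this toy setup the larger corpus doubles coverage while utilization stays the same, which is the pattern the abstract describes in qualitative terms.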
