IR · LG · Jun 26, 2025

EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora

arXiv:2506.20963v2 · 12 citations · h-index: 8 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses scalability issues for RAG systems in evolving environments, offering a practical solution for applications with continually growing corpora, though it is incremental as it builds on existing Graph-RAG methods.

The paper tackles inefficient updates in graph-based retrieval-augmented generation (Graph-RAG) over dynamic corpora. It introduces EraRAG, a multi-layered framework that uses hyperplane-based LSH to build hierarchical graph structures, and it reports up to an order-of-magnitude reduction in update time and token consumption while maintaining high accuracy.

Graph-based Retrieval-Augmented Generation (Graph-RAG) enhances large language models (LLMs) by structuring retrieval over an external corpus. However, existing approaches typically assume a static corpus, requiring expensive full-graph reconstruction whenever new documents arrive, limiting their scalability in dynamic, evolving environments. To address these limitations, we introduce EraRAG, a novel multi-layered Graph-RAG framework that supports efficient and scalable dynamic updates. Our method leverages hyperplane-based Locality-Sensitive Hashing (LSH) to partition and organize the original corpus into hierarchical graph structures, enabling efficient and localized insertions of new data without disrupting the existing topology. The design eliminates the need for retraining or costly recomputation while preserving high retrieval accuracy and low latency. Experiments on large-scale benchmarks demonstrate that EraRAG achieves up to an order of magnitude reduction in update time and token consumption compared to existing Graph-RAG systems, while providing superior accuracy performance. This work offers a practical path forward for RAG systems that must operate over continually growing corpora, bridging the gap between retrieval efficiency and adaptability. Our code and data are available at https://github.com/EverM0re/EraRAG-Official.
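The abstract's core idea, hashing embeddings against a fixed set of random hyperplanes so that new items land in a bucket without disturbing existing ones, can be illustrated with a minimal sketch. This is not the authors' implementation: the class `LSHIndex`, the helpers `make_hyperplanes` and `lsh_code`, and the toy 3-dimensional vectors are all hypothetical names and data chosen for illustration, and the sketch omits the hierarchical summarization layers the paper describes.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random Gaussian normal vectors, one per hyperplane (illustrative helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def lsh_code(vector, hyperplanes):
    """Return a bit string: one bit per hyperplane, set by the sign of the dot product."""
    bits = []
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


class LSHIndex:
    """Toy bucket index (hypothetical). The hyperplanes are fixed at construction,
    so inserting a new vector never changes the code of an existing one."""

    def __init__(self, num_planes, dim, seed=0):
        self.hyperplanes = make_hyperplanes(num_planes, dim, seed)
        self.buckets = defaultdict(list)

    def insert(self, chunk_id, vector):
        code = lsh_code(vector, self.hyperplanes)
        self.buckets[code].append(chunk_id)
        return code


# Build an index over an initial toy corpus of embeddings.
index = LSHIndex(num_planes=4, dim=3, seed=42)
initial = {"a": [1.0, 0.0, 0.0], "b": [0.9, 0.1, 0.0], "c": [-1.0, 0.0, 0.0]}
codes = {cid: index.insert(cid, vec) for cid, vec in initial.items()}

# Snapshot the buckets, then insert a new chunk incrementally.
snapshot = {code: list(ids) for code, ids in index.buckets.items()}
new_code = index.insert("d", [0.0, -1.0, 0.5])
```

Because the hyperplanes never change after construction, the insertion of `"d"` touches only the bucket keyed by `new_code`; every bucket recorded in `snapshot` keeps its earlier members in place. In a full system, only the summaries attached to the affected bucket would need recomputation, which is the kind of localized update the abstract describes.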

Code Implementations: 1 repo
