IR · AI · Feb 14, 2025

ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation

arXiv:2502.09891v3 · 28 citations · h-index: 7
Originality: Highly original
AI Analysis

This work addresses efficient integration of external knowledge into large language models, particularly for question-answering (QA) tasks.

The authors tackle two weaknesses of existing graph-based Retrieval-Augmented Generation (RAG) approaches: inaccurate identification of relevant information in the graph, and high token consumption during online retrieval. The abstract reports that ArchRAG outperforms existing methods in both accuracy and token cost, but gives no specific figures.

Retrieval-Augmented Generation (RAG) has proven effective in integrating external knowledge into large language models (LLMs) for solving question-answering (QA) tasks. State-of-the-art RAG approaches often use graph data as the external data, since it captures rich semantic information and link relationships between entities. However, existing graph-based RAG approaches cannot accurately identify the relevant information in the graph, and they consume large numbers of tokens in the online retrieval process. To address these issues, we introduce a novel graph-based RAG approach, called Attributed Community-based Hierarchical RAG (ArchRAG), which augments the question using attributed communities and introduces a novel LLM-based hierarchical clustering method. To retrieve the most relevant information from the graph for the question, we build a novel hierarchical index structure for the attributed communities and develop an effective online retrieval method. Experimental results demonstrate that ArchRAG outperforms existing methods in both accuracy and token cost.

