LG AI IR

Nov 2, 2025

AGRAG: Advanced Graph-based Retrieval-Augmented Generation for LLMs

Yubo Wang, Haoyang Li, Fei Teng, Lei Chen

arXiv:2511.05549v1, 2 citations, h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses retrieval-augmented generation for LLMs, offering incremental improvements in graph-based methods for structured knowledge tasks.

The paper tackles challenges in graph-based retrieval-augmented generation for LLMs, such as inaccurate graph construction and poor reasoning. It proposes AGRAG, which uses a statistics-based method for graph construction to avoid LLM hallucination and a greedy algorithm for subgraph generation, with the stated aims of reducing noise and enhancing reasoning ability.

Graph-based retrieval-augmented generation (Graph-based RAG) has demonstrated significant potential in enhancing Large Language Models (LLMs) with structured knowledge. However, existing methods face three critical challenges: Inaccurate Graph Construction, caused by LLM hallucination; Poor Reasoning Ability, caused by the failure to generate explicit reasons that tell the LLM why certain chunks were selected; and Inadequate Answering, where the query is only partially answered because of inadequate LLM reasoning, which makes these methods lag behind NaiveRAG on certain tasks. To address these issues, we propose AGRAG, an advanced graph-based retrieval-augmented generation framework. During graph construction, AGRAG replaces the widely used LLM-based entity extraction with a statistics-based method, avoiding hallucination and error propagation. During retrieval, AGRAG formulates the graph reasoning procedure as the Minimum Cost Maximum Influence (MCMI) subgraph generation problem, which seeks to include more nodes with high influence scores while incurring less edge cost, so that the generated reasoning paths are more comprehensive. We prove this problem to be NP-hard and propose a greedy algorithm to solve it. The generated MCMI subgraph serves as explicit reasoning paths that tell the LLM why certain chunks were retrieved, helping the LLM focus on the query-related content of the chunks, reducing the impact of noise, and improving AGRAG's reasoning ability. Furthermore, compared with simple tree-structured reasoning paths, the MCMI subgraph allows more complex graph structures, such as cycles, and improves the comprehensiveness of the generated reasoning paths.
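
The abstract does not spell out the greedy algorithm, so the following is only a minimal sketch of what a greedy heuristic for a cost-versus-influence subgraph problem could look like. Everything here is an assumption made for illustration: the function name `greedy_subgraph`, the cost `budget`, the influence-to-cost ratio used as the selection rule, and the toy graph. It is not the paper's algorithm. In particular, this simplified version only grows a tree outward from the seed nodes, whereas the MCMI subgraph described above may also contain cycles.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def greedy_subgraph(
    influence: Dict[str, float],
    edge_cost: Dict[Edge, float],
    seeds: List[str],
    budget: float,
) -> Tuple[Set[str], List[Edge]]:
    """Hypothetical greedy heuristic: repeatedly attach the frontier node
    with the best influence-per-edge-cost ratio until the budget is used up.
    Edge costs are assumed to be strictly positive."""
    # Build an undirected adjacency list from the edge-cost table.
    adj: Dict[str, List[Tuple[str, float]]] = {n: [] for n in influence}
    for (u, v), cost in edge_cost.items():
        adj[u].append((v, cost))
        adj[v].append((u, cost))

    nodes: Set[str] = set(seeds)
    edges: List[Edge] = []
    spent = 0.0

    while True:
        best = None  # (ratio, inside_node, new_node, cost)
        for u in sorted(nodes):  # sorted for deterministic tie-breaking
            for v, cost in adj[u]:
                # Skip nodes already selected and edges we cannot afford.
                if v in nodes or spent + cost > budget:
                    continue
                ratio = influence[v] / cost
                if best is None or ratio > best[0]:
                    best = (ratio, u, v, cost)
        if best is None:
            break  # no affordable frontier node remains
        _, u, v, cost = best
        nodes.add(v)
        edges.append((u, v))
        spent += cost

    return nodes, edges


if __name__ == "__main__":
    # Toy graph: "q" stands in for a query-matched seed node.
    influence = {"q": 1.0, "a": 0.9, "b": 0.2, "c": 0.8}
    edge_cost = {("q", "a"): 1.0, ("q", "b"): 1.0, ("a", "c"): 1.0, ("b", "c"): 3.0}
    print(greedy_subgraph(influence, edge_cost, seeds=["q"], budget=2.0))
```

In this toy run the selected edges form the path q, a, c, which is the kind of structure that could be passed to the LLM as an explicit reason for retrieving the chunks attached to those nodes.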
