CR, AI, IR · Aug 24, 2025

Exposing Privacy Risks in Graph Retrieval-Augmented Generation

arXiv:2508.17222v1 · 2 citations · h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses privacy vulnerabilities in Graph RAG systems, a concern for both developers and users of these systems, though it appears to be an incremental analysis of a specific attack surface.

This paper investigates privacy risks in Graph Retrieval-Augmented Generation (RAG) systems, finding that while they reduce raw text leakage, they are significantly more vulnerable to extraction of structured entity and relationship information.

Retrieval-Augmented Generation (RAG) is a powerful technique for enhancing Large Language Models (LLMs) with external, up-to-date knowledge. Graph RAG has emerged as an advanced paradigm that leverages graph-based knowledge structures to provide more coherent and contextually rich answers. However, the move from plain document retrieval to structured graph traversal introduces new, under-explored privacy risks. This paper investigates the data extraction vulnerabilities of Graph RAG systems. We design and execute tailored data extraction attacks to probe their susceptibility to leaking both raw text and structured data, such as entities and their relationships. Our findings reveal a critical trade-off: while Graph RAG systems may reduce raw text leakage, they are significantly more vulnerable to the extraction of structured entity and relationship information. We also explore potential defense mechanisms to mitigate these novel attack surfaces. This work provides a foundational analysis of the unique privacy challenges in Graph RAG and offers insights for building more secure systems.

