AI · MA · Jan 21

Query-Efficient Agentic Graph Extraction Attacks on GraphRAG Systems

arXiv:2601.14662v1 · 2 citations · h-index: 3
Originality: Highly original
AI Analysis

This work shows that modern GraphRAG systems are highly susceptible to structured extraction attacks, a security concern for anyone who relies on these systems for sensitive data such as medical or agricultural information.

The paper addresses query-efficient reconstruction of the hidden graph structure in GraphRAG systems under realistic query budgets. Its attack recovers up to 90% of entities and relationships while maintaining high precision.

Graph-based retrieval-augmented generation (GraphRAG) systems construct knowledge graphs over document collections to support multi-hop reasoning. While prior work shows that GraphRAG responses may leak retrieved subgraphs, the feasibility of query-efficient reconstruction of the hidden graph structure remains unexplored under realistic query budgets. We study a budget-constrained black-box setting where an adversary adaptively queries the system to steal its latent entity-relation graph. We propose AGEA (Agentic Graph Extraction Attack), a framework that leverages a novelty-guided exploration-exploitation strategy, external graph memory modules, and a two-stage graph extraction pipeline combining lightweight discovery with LLM-based filtering. We evaluate AGEA on medical, agriculture, and literary datasets across Microsoft-GraphRAG and LightRAG systems. Under identical query budgets, AGEA significantly outperforms prior attack baselines, recovering up to 90% of entities and relationships while maintaining high precision. These results demonstrate that modern GraphRAG systems are highly vulnerable to structured, agentic extraction attacks, even under strict query limits.
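To make the idea of a novelty-guided exploration-exploitation loop with an external graph memory more concrete, here is a minimal sketch. It is not the AGEA implementation: `GraphMemory`, `toy_oracle`, and `extract` are hypothetical names, the oracle is a toy stand-in that simply returns the triples touching a queried entity, and the novelty score (the fraction of returned triples that are new) is an assumed heuristic. The second stage described in the abstract, LLM-based filtering, is omitted.

```python
import random


class GraphMemory:
    """Hypothetical external memory of entities and relations recovered so far."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()

    def update(self, triples):
        """Store triples and return the fraction that were new (novelty score)."""
        triples = set(triples)
        if not triples:
            return 0.0
        new = triples - self.edges
        self.edges |= new
        for head, _, tail in new:
            self.nodes.update((head, tail))
        return len(new) / len(triples)


def toy_oracle(hidden_edges, entity):
    """Toy stand-in for a GraphRAG system: leaks every triple touching `entity`."""
    return sorted(e for e in hidden_edges if entity in (e[0], e[2]))


def extract(hidden_edges, seeds, budget, rng):
    """Spend at most `budget` queries, guided by the novelty of the last response."""
    memory = GraphMemory()
    queried = set()
    novelty = 1.0  # assumption: start optimistic so the first discoveries are followed up
    for _ in range(budget):
        frontier = sorted(memory.nodes - queried)        # discovered, not yet queried
        unused_seeds = [s for s in seeds if s not in queried]
        if frontier and (rng.random() < novelty or not unused_seeds):
            target = frontier[0]       # exploit: expand around a known entity
        elif unused_seeds:
            target = unused_seeds[0]   # explore: jump to an unrelated seed topic
        else:
            break                      # nothing left to ask
        queried.add(target)
        novelty = memory.update(toy_oracle(hidden_edges, target))
    return memory


# Toy hidden graph with two disconnected topics (illustrative data only).
hidden = {
    ("aspirin", "treats", "headache"),
    ("headache", "symptom_of", "migraine"),
    ("wheat", "susceptible_to", "rust"),
    ("rust", "caused_by", "fungus"),
}
recovered = extract(hidden, seeds=["aspirin", "wheat"], budget=10, rng=random.Random(0))
recall = len(recovered.edges & hidden) / len(hidden)
```

In this sketch, a response that returns mostly new triples makes the next query more likely to expand the frontier, while a stale response pushes the loop toward an unused seed. A real attack would have to parse triples out of natural-language answers, which is where a filtering stage would matter for precision.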
