IR · AI · Apr 15

Agentic GraphRAG: Navigating Unstructured Financial Data with Collaborative AI

arXiv:2605.18770 · 50.1

Predicted impact: top 52% in IR · last 90 days · Originality: Incremental advance

AI Analysis

This work addresses a practical problem for expert analysts: public registries mix structured records with large volumes of unstructured financial and legal text, which makes them hard to navigate. It offers a modular, transferable solution that outperforms a standard agentic vector-RAG baseline on both retrieval and conversational benchmarks.

The paper presents a collaborative agentic GraphRAG framework for analyzing commercial registry data, combining structured and unstructured information in a Neo4j knowledge graph queried by modular agents. On the Swiss Official Gazette of Commerce corpus, it outperforms a standard agentic vector-RAG baseline on correctness, answer relevance, information recall, turn success rate, and context carryover accuracy.

We present a collaborative agentic GraphRAG framework for expert analysis of commercial registry data. Public registries are often formally accessible, yet difficult to use in practice because they combine structured records with large volumes of unstructured legal text. This limits conventional keyword and vector-only retrieval, especially for multi-hop, temporal, and entity-centric investigations. Our approach builds a Neo4j knowledge graph through a three-phase pipeline: (i) deterministic ingestion of strong nodes from verified structured fields, (ii) LLM-based extraction of weak nodes from unstructured notices, and (iii) deterministic identity resolution and deduplication. On top of this graph, we introduce an analytical modular agent that integrates zero-shot intent routing, a bounded reflection loop, secure tool-mediated graph access, and state-aware response synthesis. A human-in-the-loop dashboard exposes evidence and execution traces to support transparency and auditability. We evaluate the framework on the Swiss Official Gazette of Commerce, a multilingual corpus of more than seven million publications over seven years. We further contribute a multi-tier evaluation protocol covering entity-resolution precision, tool-routing behavior, answer quality, and multi-turn conversational performance. Across automated, human-curated, and conversational benchmarks, the proposed agentic GraphRAG system consistently outperforms a standard agentic vector-RAG baseline, with strong gains in correctness, answer relevance, information recall, turn success rate, and context carryover accuracy. The architecture is modular, reproducible, and transferable to other commercial gazettes and public-sector registry systems.
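
The three-phase graph construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Node` fields, the record keys (`id`, `company`, `text`), the accent-stripping `normalize` rule, and the regex-based `toy_extractor` (a stand-in for the LLM extraction step) are all assumptions made for illustration, and an in-memory dictionary takes the place of the Neo4j graph.

```python
import re
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    kind: str    # "strong" (verified structured field) or "weak" (extracted from text)
    source: str  # id of the publication the node came from


def normalize(name: str) -> str:
    """Deterministic key: strip accents, lowercase, collapse punctuation and spaces."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    return re.sub(r"[^a-z0-9]+", " ", ascii_name.lower()).strip()


def ingest_strong(records):
    """Phase (i): one strong node per verified structured record."""
    return [Node(r["company"], "strong", r["id"]) for r in records]


def extract_weak(notices, extractor):
    """Phase (ii): weak nodes from free text; `extractor` stands in for an LLM call."""
    return [
        Node(name, "weak", notice["id"])
        for notice in notices
        for name in extractor(notice["text"])
    ]


def resolve(nodes):
    """Phase (iii): merge nodes that share a normalized key; strong nodes take priority."""
    merged = {}
    for node in nodes:
        key = normalize(node.name)
        current = merged.get(key)
        if current is None or (current.kind == "weak" and node.kind == "strong"):
            merged[key] = node
    return merged


def toy_extractor(text):
    """Hypothetical placeholder for the LLM: capitalized word followed by 'AG'."""
    return re.findall(r"[A-Z]\w+ AG", text)


# Hypothetical sample data, not taken from the corpus.
records = [{"id": "pub-1", "company": "Muller AG"}]
notices = [{"id": "pub-2", "text": "Müller AG appoints a new director; Beta AG is mentioned."}]

graph = resolve(ingest_strong(records) + extract_weak(notices, toy_extractor))
for key, node in sorted(graph.items()):
    print(key, node.kind, node.source)
```

In this sketch the weak mention "Müller AG" collapses into the strong node "Muller AG" because both normalize to the same key, while "Beta AG" survives as a weak node. A real system would likely need a stricter resolution rule than accent stripping, since two distinct legal entities can differ only by a diacritic.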
