DB | AI | HC | Feb 24, 2025

Graphy'our Data: Towards End-to-End Modeling, Exploring and Generating Report from Raw Data

arXiv:2502.16868v1 | 3 citations | h-index: 2 | SIGMOD Conference Companion
Originality: Incremental advance
AI Analysis

This work addresses the problem of handling large sets of unstructured documents for tasks such as literature surveys and offers a user-friendly solution, though it appears incremental because it builds on existing LLM and graph-based methods.

The paper tackles the challenge of Progressive Document Investigation by introducing Graphy, an end-to-end platform that automates data modeling, exploration, and high-quality report generation from raw documents, demonstrated with a pre-scrapped graph of over 50,000 papers.

Large Language Models (LLMs) have recently demonstrated remarkable performance in tasks such as Retrieval-Augmented Generation (RAG) and autonomous AI agent workflows. Yet, when faced with large sets of unstructured documents requiring progressive exploration, analysis, and synthesis, such as conducting a literature survey, existing approaches often fall short. We address this challenge -- termed Progressive Document Investigation -- by introducing Graphy, an end-to-end platform that automates data modeling, exploration, and high-quality report generation in a user-friendly manner. Graphy comprises an offline Scrapper that transforms raw documents into a structured graph of Fact and Dimension nodes, and an online Surveyor that enables iterative exploration and LLM-driven report generation. We showcase a pre-scrapped graph of over 50,000 papers -- complete with their references -- demonstrating how Graphy facilitates the literature-survey scenario. The demonstration video can be found at https://youtu.be/uM4nzkAdGlM.
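
To make the "graph of Fact and Dimension nodes" more concrete, here is a minimal Python sketch of one way such a structure could be represented and explored. It is not Graphy's implementation: the class names `Fact` and `DocumentGraph`, the methods `facts_sharing` and `references_of`, and the example dimension `"challenge"` are all hypothetical, and it assumes that a Fact stands for one document (a paper) while Dimensions are attributes extracted from it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Fact:
    """One document, e.g. a paper (assumed meaning of a Fact node)."""
    fact_id: str
    title: str
    # Dimension name -> extracted values (assumed meaning of Dimension nodes).
    dimensions: Dict[str, List[str]] = field(default_factory=dict)
    # Ids of the documents this one cites.
    references: List[str] = field(default_factory=list)


class DocumentGraph:
    """Hypothetical in-memory graph linking Facts through shared Dimensions."""

    def __init__(self) -> None:
        self.facts: Dict[str, Fact] = {}
        # (dimension name, value) -> ids of Facts carrying that value.
        self._by_dimension: Dict[Tuple[str, str], Set[str]] = {}

    def add_fact(self, fact: Fact) -> None:
        self.facts[fact.fact_id] = fact
        for name, values in fact.dimensions.items():
            for value in values:
                self._by_dimension.setdefault((name, value), set()).add(fact.fact_id)

    def facts_sharing(self, name: str, value: str) -> List[str]:
        """Exploration step: all Facts that share one Dimension value."""
        return sorted(self._by_dimension.get((name, value), set()))

    def references_of(self, fact_id: str) -> List[str]:
        """Exploration step: cited documents that are present in the graph."""
        return [ref for ref in self.facts[fact_id].references if ref in self.facts]


graph = DocumentGraph()
graph.add_fact(Fact("p1", "Paper one", {"challenge": ["scalability"]}, ["p2", "p9"]))
graph.add_fact(Fact("p2", "Paper two", {"challenge": ["scalability", "latency"]}))
```

In this sketch, an iterative exploration would alternate between calls such as `graph.facts_sharing("challenge", "scalability")` and `graph.references_of("p1")`, collecting the selected documents before handing them to a report-generation step.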

