CL, IR | Feb 5

A Human-in-the-Loop, LLM-Centered Architecture for Knowledge-Graph Question Answering

Larissa Pusch, Alexandre Courtiol, Tim Conrad

arXiv:2602.05512v2 | 1.11 citations | h-index: 31

Originality: Incremental advance

AI Analysis

This work addresses the challenge of making knowledge graphs accessible to users who lack query-language expertise, a practical barrier for researchers and data analysts working with complex datasets.

The paper introduces an interactive framework in which LLMs generate and explain Cypher graph queries and users refine them in natural language. Its core evaluation is a 90-query benchmark on a synthetic movie KG that measures query explanation quality and fault detection across multiple LLMs. The authors report improved accessibility to complex datasets while preserving factual accuracy and semantic rigor.

Large Language Models (LLMs) excel at language understanding but remain limited in knowledge-intensive domains due to hallucinations, outdated information, and limited explainability. Text-based retrieval-augmented generation (RAG) helps ground model outputs in external sources but struggles with multi-hop reasoning. Knowledge Graphs (KGs), in contrast, support precise, explainable querying, yet they require knowledge of a query language. This work introduces an interactive framework in which LLMs generate and explain Cypher graph queries and users iteratively refine them through natural language. Applied to real-world KGs, the framework improves accessibility to complex datasets while preserving factual accuracy and semantic rigor, and it provides insight into how model performance varies across domains. Our core quantitative evaluation is a 90-query benchmark on a synthetic movie KG that measures query explanation quality and fault detection across multiple LLMs, complemented by two smaller real-life query-generation experiments on a Hyena KG and the MaRDI (Mathematical Research Data Initiative) KG.
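The generate, explain, and refine cycle described in the abstract can be pictured roughly as follows. This is a minimal sketch, not the authors' implementation: `QueryDraft`, `refine_query`, `stub_generate`, and the `Person`/`Movie`/`ACTED_IN` schema are all hypothetical names chosen for illustration, and the LLM is replaced by a hard-coded stub so the loop runs without any model or database.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class QueryDraft:
    """A candidate Cypher query plus its natural-language explanation."""
    cypher: str
    explanation: str


# Hypothetical signatures, assumed for this sketch only.
Generator = Callable[[str, List[str], Optional[QueryDraft]], QueryDraft]
FeedbackFn = Callable[[QueryDraft], Optional[str]]


def refine_query(
    question: str,
    generate: Generator,
    get_feedback: FeedbackFn,
    max_rounds: int = 5,
) -> Tuple[QueryDraft, List[str]]:
    """Regenerate the draft until the user accepts it or rounds run out."""
    history: List[str] = []
    draft = generate(question, history, None)
    for _ in range(max_rounds):
        feedback = get_feedback(draft)  # user reads query + explanation
        if feedback is None:            # None means "accepted"
            break
        history.append(feedback)
        draft = generate(question, history, draft)
    return draft, history


def stub_generate(
    question: str, history: List[str], previous: Optional[QueryDraft]
) -> QueryDraft:
    """Stand-in for an LLM call; the schema below is an assumed movie KG."""
    cypher = (
        "MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) "
        "RETURN m.title"
    )
    explanation = "Finds the titles of movies that Tom Hanks acted in."
    if any("limit" in note.lower() for note in history):
        cypher += " LIMIT 10"
        explanation += " Returns at most 10 rows."
    return QueryDraft(cypher, explanation)


# Scripted user: one correction, then acceptance.
script = iter(["Limit the results to 10", None])
final_draft, feedback_log = refine_query(
    "Which movies did Tom Hanks act in?", stub_generate, lambda d: next(script)
)
```

In a real system, `stub_generate` would be replaced by a model call that receives the question, the feedback history, and the previous draft, and `get_feedback` would be an interactive prompt. Keeping the explanation alongside the query is what lets a user without Cypher expertise judge whether a draft matches their intent.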
