LG, AI, CL, IR · Nov 6, 2025

Ground-Truth Subgraphs for Better Training and Evaluation of Knowledge Graph Augmented LLMs

arXiv:2511.04473v1 · 1 citation · h-index: 9
AI Analysis

This paper makes it easier for researchers to compare knowledge-graph-augmented LLM methods, though the contribution is incremental because it builds on existing retrieval approaches.

The paper tackles the lack of challenging QA datasets with ground-truth targets for evaluating knowledge graph retrieval methods. It introduces SynthKGQA, a framework for generating synthetic datasets from any knowledge graph, and applies it to Wikidata to create GTSQA, a dataset on which popular KG-augmented LLM solutions are benchmarked.

Retrieval of information from graph-structured knowledge bases represents a promising direction for improving the factuality of LLMs. While various solutions have been proposed, a comparison of methods is difficult due to the lack of challenging QA datasets with ground-truth targets for graph retrieval. We present SynthKGQA, a framework for generating high-quality synthetic Knowledge Graph Question Answering datasets from any Knowledge Graph, providing the full set of ground-truth facts in the KG to reason over each question. We show how, in addition to enabling more informative benchmarking of KG retrievers, the data produced with SynthKGQA also allows us to train better models. We apply SynthKGQA to Wikidata to generate GTSQA, a new dataset designed to test zero-shot generalization abilities of KG retrievers with respect to unseen graph structures and relation types, and benchmark popular solutions for KG-augmented LLMs on it.
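To illustrate why ground-truth facts make retriever benchmarking more informative, here is a minimal sketch of triple-level scoring: a retriever's output is compared against the set of ground-truth triples for a question. The function name `retrieval_scores`, the triple representation, and the choice of precision/recall/F1 are assumptions made for illustration; the abstract does not specify which metrics the paper reports.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical representation: a KG fact as (subject, relation, object).
Triple = Tuple[str, str, str]


def retrieval_scores(
    retrieved: Iterable[Triple], ground_truth: Iterable[Triple]
) -> Dict[str, float]:
    """Score a retrieved set of triples against a ground-truth subgraph.

    Illustrative only: set-based precision, recall and F1 over exact
    triple matches. Not necessarily the metric used in the paper.
    """
    retrieved_set = set(retrieved)
    truth_set = set(ground_truth)
    hits = len(retrieved_set & truth_set)

    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(truth_set) if truth_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up example triples (not taken from GTSQA).
ground_truth = [
    ("Marie Curie", "spouse", "Pierre Curie"),
    ("Pierre Curie", "place of birth", "Paris"),
]
retrieved = [
    ("Marie Curie", "spouse", "Pierre Curie"),
    ("Pierre Curie", "place of birth", "Paris"),
    ("Marie Curie", "occupation", "physicist"),  # irrelevant to the question
]
scores = retrieval_scores(retrieved, ground_truth)
```

Without ground-truth triples, a retriever can only be judged indirectly through the final answer of the LLM; with them, missing facts (low recall) and noisy retrieval (low precision) can be told apart.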

