Can LLMs be Good Graph Judge for Knowledge Graph Construction?
This paper addresses the problem of noisy and inaccurate KG construction from real-world documents, with applications in information retrieval and knowledge management, though the contribution appears incremental because it builds on existing LLM-based methods.
The paper tackles the challenge of converting noisy, unstructured text into accurate Knowledge Graphs (KGs) by proposing GraphJudge, a framework that uses an entity-centric strategy and a fine-tuned LLM as a graph judge, achieving state-of-the-art performance on two general datasets and one domain-specific dataset.
In real-world scenarios, most of the data obtained from information retrieval (IR) systems is unstructured. Converting natural language sentences into structured Knowledge Graphs (KGs) remains a critical challenge. We identify three limitations of existing KG construction methods: (1) real-world documents may contain a large amount of noise, which can lead to the extraction of messy information; (2) naive LLMs often extract inaccurate knowledge from domain-specific documents; and (3) hallucination cannot be overlooked when LLMs are used directly to construct KGs. In this paper, we propose \textbf{GraphJudge}, a KG construction framework that addresses these challenges. Within this framework, we design an entity-centric strategy to eliminate noisy information in the documents, and we fine-tune an LLM as a graph judge to enhance the quality of the generated KGs. Experiments on two general and one domain-specific text-graph pair datasets demonstrate state-of-the-art performance against various baseline methods, along with strong generalization ability. Our code is available at \href{https://github.com/hhy-huang/GraphJudge}{https://github.com/hhy-huang/GraphJudge}.
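The "graph judge" idea described in the abstract can be pictured as a filtering step over candidate triples. The following is only a minimal sketch under stated assumptions: the names `Triple`, `filter_triples`, and `toy_judge` are hypothetical and do not come from the GraphJudge code base, and the rule-based `toy_judge` is a stand-in for the fine-tuned LLM, whose actual prompts and decision procedure are not specified in the abstract.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical representation of a KG triple: (head, relation, tail).
Triple = Tuple[str, str, str]


def filter_triples(
    candidates: Iterable[Triple],
    judge: Callable[[Triple], bool],
) -> List[Triple]:
    """Keep only the candidate triples that the judge accepts."""
    return [t for t in candidates if judge(t)]


def toy_judge(triple: Triple) -> bool:
    """Stand-in for a fine-tuned LLM judge (assumption, not the paper's model).

    Accepts a triple only if every field is non-empty and the head
    differs from the tail.
    """
    head, relation, tail = triple
    all_present = bool(head.strip() and relation.strip() and tail.strip())
    return all_present and head != tail


# Illustrative candidates, as might come from an upstream extraction step.
candidates = [
    ("Marie Curie", "won", "Nobel Prize"),
    ("Marie Curie", "is", "Marie Curie"),  # degenerate self-loop
    ("", "born in", "Warsaw"),             # missing head entity
]

kept = filter_triples(candidates, toy_judge)
print(kept)
```

Passing the judge as a callable keeps the filtering logic independent of how acceptance is decided, so the rule-based stand-in could in principle be replaced by a call to a fine-tuned model without changing `filter_triples`.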