GS-KGC: A Generative Subgraph-based Framework for Knowledge Graph Completion with Large Language Models
GS-KGC addresses the problem of incomplete knowledge graphs, which limits applications that rely on structured data, and offers an incremental improvement by incorporating subgraph context into LLM-based methods.
The paper tackles knowledge graph completion by proposing GS-KGC, a framework that uses subgraph information with large language models to generate missing triples, resulting in a 5.6% increase in Hits@3 on FB15k-237N and a 9.3% increase on ICEWS14 compared to prior LLM-based methods.
Knowledge graph completion (KGC) focuses on identifying missing triples in a knowledge graph (KG), which is crucial for many downstream applications. Given the rapid development of large language models (LLMs), several LLM-based methods have been proposed for the KGC task. However, most of them focus on prompt engineering and overlook the fact that finer-grained subgraph information can help LLMs generate more accurate answers. In this paper, we propose a novel completion framework called \textbf{G}enerative \textbf{S}ubgraph-based KGC (GS-KGC), which uses subgraph information as context for reasoning and frames KGC as a question-answering (QA) task. Its core component is a subgraph partitioning algorithm designed to generate negatives and neighbors. Specifically, negatives encourage LLMs to generate a broader range of answers, while neighbors provide additional contextual insights for LLM reasoning. Furthermore, we find that GS-KGC can discover potential triples within the KGs as well as new facts beyond the KGs. Experiments on four common KGC datasets highlight the advantages of GS-KGC: for example, it achieves a 5.6\% increase in Hits@3 over the LLM-based model CP-KGC on FB15k-237N and a 9.3\% increase over the LLM-based model TECHS on ICEWS14.
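To make the roles of negatives and neighbors concrete, the following is a minimal Python sketch of how subgraph context for a query (head, relation, ?) might be collected and turned into a QA-style prompt. It is an illustration under stated assumptions, not the partitioning algorithm of the paper: here negatives are assumed to be tail entities already known for the same head and relation, and neighbors are assumed to be other triples that touch the head entity. The function names `subgraph_context` and `build_prompt`, the prompt wording, and the toy triples are all hypothetical.

```python
from collections import defaultdict


def subgraph_context(triples, head, relation, max_negatives=3, max_neighbors=3):
    """Collect negatives and neighbors for the query (head, relation, ?).

    Assumption for this sketch: negatives are tails already known for
    (head, relation); neighbors are other triples involving the head entity.
    """
    tails_by_query = defaultdict(set)    # (h, r) -> known tail entities
    triples_by_entity = defaultdict(list)  # entity -> triples that mention it
    for h, r, t in triples:
        tails_by_query[(h, r)].add(t)
        triples_by_entity[h].append((h, r, t))
        if t != h:  # avoid listing a self-loop twice
            triples_by_entity[t].append((h, r, t))

    # Sort for a deterministic order before truncating.
    negatives = sorted(tails_by_query.get((head, relation), set()))[:max_negatives]

    # Skip triples that answer the query itself; they are covered by negatives.
    neighbors = [
        tr for tr in triples_by_entity.get(head, [])
        if not (tr[0] == head and tr[1] == relation)
    ][:max_neighbors]
    return negatives, neighbors


def build_prompt(head, relation, negatives, neighbors):
    """Format the query and its subgraph context as a QA-style prompt (hypothetical wording)."""
    lines = [f"Question: ({head}, {relation}, ?) What is the missing tail entity?"]
    if neighbors:
        facts = "; ".join(f"({h}, {r}, {t})" for h, r, t in neighbors)
        lines.append(f"Neighboring facts: {facts}")
    if negatives:
        lines.append("Known answers to exclude: " + ", ".join(negatives))
    lines.append("Answer:")
    return "\n".join(lines)


# Toy knowledge graph, invented for illustration only.
toy_triples = [
    ("Marie Curie", "award", "Nobel Prize in Physics"),
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "spouse", "Pierre Curie"),
    ("Pierre Curie", "award", "Nobel Prize in Physics"),
]
toy_negatives, toy_neighbors = subgraph_context(toy_triples, "Marie Curie", "award")
print(build_prompt("Marie Curie", "award", toy_negatives, toy_neighbors))
```

In this sketch, listing the already known answers nudges the model toward proposing a different tail entity, which mirrors the stated purpose of negatives, while the neighboring facts supply extra context about the head entity. The caps `max_negatives` and `max_neighbors` are illustrative knobs for keeping the prompt short.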