GenIC: An LLM-Based Framework for Instance Completion in Knowledge Graphs
This work addresses missing facts in knowledge bases that support applications such as search and recommendation. The contribution is incremental, since it builds on existing LLM and knowledge graph methods.
The authors tackle knowledge graph instance completion, where only the head entity is known. They propose GenIC, an LLM-based framework that uses entity descriptions and types to predict relation-tail pairs, and report that it outperforms existing baselines on three datasets.
Knowledge graph completion aims to fill gaps in knowledge bases by adding new triples that represent facts. The complexity of this task depends on how many parts of a triple are already known. Instance completion involves predicting the relation-tail pair when only the head entity is given, i.e., completing (h, ?, ?). Notably, modern knowledge bases often contain entity descriptions and types, which can provide valuable context for inferring missing facts. By leveraging these textual descriptions, together with the ability of large language models to extract facts from them and to recognize patterns within the knowledge graph schema, we propose an LLM-powered, end-to-end instance completion approach. Specifically, we introduce GenIC: a two-step Generative Instance Completion framework. The first step is property prediction, treated as a multi-label classification task. The second step is link prediction, framed as a generative sequence-to-sequence task. Experimental results on three datasets show that our method outperforms existing baselines. Our code is available at https://github.com/amal-gader/genic.
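The two-step structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Entity`, `predict_properties`, `complete_instance`, `toy_scorer`, and `toy_generator` are hypothetical, and the two toy functions are rule-based stand-ins for the LLM calls, used only so that the control flow can be run end to end. The relation names and the 0.5 threshold are likewise assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Entity:
    """Head entity with the textual context the abstract mentions."""
    name: str
    description: str
    types: List[str]


# Hypothetical interfaces standing in for the two LLM-backed steps.
PropertyScorer = Callable[[Entity, str], float]   # score of one relation for a head
TailGenerator = Callable[[Entity, str], str]      # generated tail for (head, relation)


def predict_properties(head: Entity, candidates: List[str],
                       scorer: PropertyScorer, threshold: float = 0.5) -> List[str]:
    """Step 1 (multi-label classification): keep every candidate relation
    whose score reaches the threshold, so several labels may be selected."""
    return [r for r in candidates if scorer(head, r) >= threshold]


def complete_instance(head: Entity, candidates: List[str], scorer: PropertyScorer,
                      generator: TailGenerator) -> List[Tuple[str, str, str]]:
    """Step 2 (generative link prediction): for each predicted relation,
    generate a tail and assemble the completed (h, r, t) triple."""
    relations = predict_properties(head, candidates, scorer)
    return [(head.name, r, generator(head, r)) for r in relations]


# --- Toy stand-ins (assumptions, NOT the models used in the paper) ---
CUES = {"place_of_birth": "born in", "occupation": "physicist"}


def toy_scorer(head: Entity, relation: str) -> float:
    """Return 1.0 if a cue phrase for the relation occurs in the description."""
    cue = CUES.get(relation)
    return 1.0 if cue is not None and cue in head.description else 0.0


def toy_generator(head: Entity, relation: str) -> str:
    """Extract a tail from the description with simple string rules."""
    if relation == "place_of_birth":
        return head.description.split("born in ")[1].split(".")[0]
    if relation == "occupation":
        return "physicist"
    return ""


RELATIONS = ["place_of_birth", "occupation", "spouse"]
curie = Entity(
    name="Marie Curie",
    description="Marie Curie was a physicist born in Warsaw.",
    types=["human"],
)
triples = complete_instance(curie, RELATIONS, toy_scorer, toy_generator)
print(triples)
```

In a real system the scorer and generator would each be backed by a language model that reads the entity description and types. Keeping them behind plain callables separates the pipeline logic from the choice of model.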