CL · AI · Jan 28, 2024

Contextualization Distillation from Large Language Model for Knowledge Graph Completion

Dawei Li, Zhen Tan, Tianlong Chen, Huan Liu

arXiv:2402.01729v327.6113 citations · h-index: 14 · Has Code · Findings

Originality: Incremental advance

AI Analysis

This work offers researchers and practitioners a plug-and-play method for improving the accuracy and explainability of knowledge graph completion (KGC) models. The advance is incremental, since it builds on existing KGC frameworks.

The paper tackles the limitations of static and noisy textual corpora in knowledge graph completion (KGC) by introducing a Contextualization Distillation strategy that uses large language models to enrich triplets and trains smaller models via auxiliary tasks, achieving consistent performance improvements across diverse datasets and KGC techniques.

While textual information significantly enhances the performance of pre-trained language models (PLMs) in knowledge graph completion (KGC), the static and noisy nature of existing corpora collected from Wikipedia articles or synsets definitions often limits the potential of PLM-based KGC models. To surmount these challenges, we introduce the Contextualization Distillation strategy, a versatile plug-in-and-play approach compatible with both discriminative and generative KGC frameworks. Our method begins by instructing large language models (LLMs) to transform compact, structural triplets into context-rich segments. Subsequently, we introduce two tailored auxiliary tasks, reconstruction and contextualization, allowing smaller KGC models to assimilate insights from these enriched triplets. Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach, revealing consistent performance enhancements irrespective of underlying pipelines or architectures. Moreover, our analysis makes our method more explainable and provides insight into generating path selection, as well as the choosing of suitable distillation tasks. All the code and data in this work will be released at https://github.com/David-Li0406/Contextulization-Distillation
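The pipeline described in the abstract (prompt an LLM to expand a triplet into a passage, then train a smaller model on reconstruction and contextualization tasks) can be illustrated with a minimal data-preparation sketch. Everything here is an assumption made for illustration rather than the authors' released code: the `Triplet` class, the prompt wording, the `head | relation | tail` linearization, and the reading of "reconstruction" as recovering a passage from a randomly masked copy and "contextualization" as generating the passage from the triplet. No LLM is called; the passage is passed in as a plain string.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Triplet:
    """Hypothetical container for one knowledge graph triplet."""
    head: str
    relation: str
    tail: str


def linearize(t: Triplet) -> str:
    # Assumed text form of a triplet; the paper may use a different template.
    return f"{t.head} | {t.relation} | {t.tail}"


def build_contextualization_prompt(t: Triplet) -> str:
    # Hypothetical instruction asking an LLM to turn a triplet into a passage.
    return (
        "Write a short descriptive paragraph that expresses the following "
        f"knowledge graph triplet: ({linearize(t)})."
    )


def contextualization_example(t: Triplet, passage: str) -> Tuple[str, str]:
    # Assumed auxiliary task: source is the triplet, target is the LLM passage.
    return linearize(t), passage


def reconstruction_example(
    passage: str,
    mask_ratio: float = 0.15,
    mask_token: str = "[MASK]",
    seed: int = 0,
) -> Tuple[str, str]:
    # Assumed auxiliary task: recover the original passage from a corrupted
    # copy in which a fraction of whitespace-separated tokens is masked.
    rng = random.Random(seed)
    tokens = passage.split()
    n_mask = min(len(tokens), max(1, round(len(tokens) * mask_ratio))) if tokens else 0
    masked = set(rng.sample(range(len(tokens)), n_mask))
    corrupted = " ".join(
        mask_token if i in masked else tok for i, tok in enumerate(tokens)
    )
    return corrupted, passage
```

In use, the prompt would be sent to whichever LLM is available, and the returned passage fed to both helper functions to produce (source, target) pairs for the smaller KGC model. The masking ratio and token are placeholders; a real setup would follow the tokenizer and objective of the chosen model.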

