CL May 19

KoRe: Compact Knowledge Representations for Large Language Models

Davide Cavicchini, Fausto Giunchiglia, Jacopo Staiano

arXiv:2605.20170

AI Analysis

For LLM practitioners, KoRe offers a more efficient and editable way to inject world knowledge without extensive retraining.

KoRe encodes 1-hop knowledge graph sub-graphs into compact discrete tokens to ground LLMs, achieving competitive performance on three benchmarks while reducing token usage by up to 10x.

Modern Large Language Models (LLMs) have shown impressive performance in user-facing tasks such as question answering, as well as consistent improvements in reasoning capabilities. Still, the way these models encode knowledge seems inherently flawed: by design, LLMs encode world knowledge within their parameters. This way of representing knowledge is opaque, difficult to debug and update, and prone to hallucinations. On the other hand, Knowledge Graphs (KGs) can provide human-readable and easily editable world knowledge representations, and their application in knowledge-intensive tasks has consistently proven beneficial to downstream performance. Nonetheless, current integration techniques require extensive retraining or finetuning. To overcome this issue, we introduce KoRe, a methodology to encode 1-hop sub-graphs into compact discrete knowledge tokens and inject them into an LLM backbone. We test the proposed approach on three established benchmarks, and report competitive performance coupled with a significant reduction (up to 10x) in token usage. Our results show that compact discrete KG representations can efficiently and effectively be used to ground modern LLMs.
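To make the term "1-hop sub-graph" concrete, here is a minimal sketch in Python. It is not KoRe's implementation: the toy triples, the function names `one_hop_subgraph` and `verbalize`, and the whitespace word count used as a stand-in for a tokenizer are all assumptions made for illustration. It only shows what the 1-hop neighbourhood of an entity is, and why passing it to a model as plain text costs more tokens as the neighbourhood grows, which is the cost a compact encoding aims to reduce.

```python
# Toy knowledge graph as (head, relation, tail) triples. Illustrative data only.
TRIPLES = [
    ("Rome", "capital_of", "Italy"),
    ("Rome", "located_in", "Lazio"),
    ("Italy", "member_of", "European Union"),
    ("Tiber", "flows_through", "Rome"),
    ("Paris", "capital_of", "France"),
]


def one_hop_subgraph(entity, triples):
    """Return every triple in which `entity` appears as head or tail."""
    return [t for t in triples if t[0] == entity or t[2] == entity]


def verbalize(triples):
    """Flatten triples into plain text, as a text-only prompt would carry them."""
    return ". ".join(f"{h} {r.replace('_', ' ')} {t}" for h, r, t in triples)


subgraph = one_hop_subgraph("Rome", TRIPLES)
text = verbalize(subgraph)

# Crude proxy for prompt length: whitespace-separated words, not real LLM tokens.
text_cost = len(text.split())

print(subgraph)
print(text)
print(text_cost)
```

In this sketch the text cost grows linearly with the number of triples touching the entity. A real comparison would use the backbone model's own tokenizer and the paper's actual encoding, neither of which is reproduced here.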
