CLAI
Oct 20, 2025

AtlasKV: Augmenting LLMs with Billion-Scale Knowledge Graphs in 20GB VRAM

arXiv:2510.17934v1, 2 citations, h-index: 14
Originality: Highly original
AI Analysis

This paper addresses the high inference latency and memory cost of retrieval-augmented generation for LLMs, offering a scalable approach to knowledge integration.

The paper tackles the problem of efficiently augmenting large language models with billion-scale knowledge graphs. It does so with less than 20GB of VRAM and sub-linear time complexity, while maintaining strong performance without external retrievers.

Retrieval-augmented generation (RAG) has shown some success in augmenting large language models (LLMs) with external knowledge. However, as a non-parametric knowledge integration paradigm for LLMs, RAG methods rely heavily on external retrieval modules and on the retrieved textual context as a prior. For very large-scale knowledge augmentation in particular, they introduce substantial inference latency due to expensive searches and much longer relevant context. In this paper, we propose a parametric knowledge integration method called AtlasKV, a scalable, effective, and general way to augment LLMs with billion-scale knowledge graphs (KGs) (e.g., 1B triples) at very little GPU memory cost (e.g., less than 20GB VRAM). In AtlasKV, we introduce KG2KV and HiKVP to integrate KG triples into LLMs at scale with sub-linear time and memory complexity. AtlasKV maintains strong knowledge grounding and generalization performance using the LLMs' inherent attention mechanism, and requires no external retrievers, long context priors, or retraining when adapting to new knowledge.
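The abstract does not describe how KG2KV or HiKVP work internally, so the sketch below is not AtlasKV's algorithm. It is only a generic illustration, under stated assumptions, of how a two-level lookup over key-value pairs can score far fewer keys than a flat scan, which is the kind of sub-linear behaviour the abstract claims. All names (`build_two_level_index`, `lookup`, the toy keys and the `fact-g-m` values) are hypothetical, and the hand-written key vectors stand in for whatever representation a real system would derive from a KG triple.

```python
def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


def build_two_level_index(kv_pairs, group_size):
    """Split (key, value) pairs into fixed-size groups and summarize
    each group by the mean of its keys (a hypothetical 'coarse key')."""
    groups = []
    for start in range(0, len(kv_pairs), group_size):
        members = kv_pairs[start:start + group_size]
        dim = len(members[0][0])
        centroid = [sum(key[d] for key, _ in members) / len(members)
                    for d in range(dim)]
        groups.append((centroid, members))
    return groups


def lookup(query, groups, top_groups=1):
    """Score the coarse keys first, then only the keys inside the best
    groups. Returns the best value and how many keys were scored."""
    ranked = sorted(groups, key=lambda g: dot(query, g[0]), reverse=True)
    candidates = [kv for _, members in ranked[:top_groups] for kv in members]
    _, best_value = max(candidates, key=lambda kv: dot(query, kv[0]))
    return best_value, len(groups) + len(candidates)


def toy_key(group, member):
    """Toy 8-dimensional key: one coordinate marks the group, another
    marks the member. Assumption made only for this illustration."""
    key = [0.0] * 8
    key[group] = 1.0
    key[4 + member] = 0.5
    return key


# 16 toy pairs in 4 groups of 4; values are placeholder strings.
kv_pairs = [(toy_key(g, m), f"fact-{g}-{m}")
            for g in range(4) for m in range(4)]
index = build_two_level_index(kv_pairs, group_size=4)

value, scored = lookup(toy_key(2, 3), index)
print(value, scored)  # fact-2-3 8
```

In this toy setting the lookup scores 8 keys (4 coarse keys plus 4 members) instead of all 16. With groups of roughly the square root of N pairs, the count grows like the square root of N rather than N, at the price of possibly missing a match that sits in a group whose coarse key scores poorly.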

