LG · CL · Dec 10, 2025

Are Hypervectors Enough? Single-Call LLM Reasoning over Knowledge Graphs

Yezi Liu, William Youngwoo Chung, Hanning Chen, Calvin Yeung, Mohsen Imani

arXiv:2512.09369v1 · 4.1 · h-index: 9

Originality: Incremental advance

AI Analysis

The work addresses efficiency and interpretability issues in deploying KG-LLM systems and offers a practical trade-off, though it is incremental in that it combines existing techniques.

The paper tackles the high latency and cost of knowledge graph reasoning with large language models. It proposes PathHD, a framework that uses hyperdimensional computing and a single LLM call per query, and reports comparable or better Hits@1 while reducing latency by 40-60% and GPU memory by 3-5x.

Recent advances in large language models (LLMs) have enabled strong reasoning over both structured and unstructured knowledge. When grounded on knowledge graphs (KGs), however, prevailing pipelines rely on heavy neural encoders to embed and score symbolic paths or on repeated LLM calls to rank candidates, leading to high latency, GPU cost, and opaque decisions that hinder faithful, scalable deployment. We propose PathHD, a lightweight and encoder-free KG reasoning framework that replaces neural path scoring with hyperdimensional computing (HDC) and uses only a single LLM call per query. PathHD encodes relation paths into block-diagonal GHRR hypervectors, ranks candidates with blockwise cosine similarity and Top-K pruning, and then performs a one-shot LLM adjudication to produce the final answer together with cited supporting paths. Technically, PathHD is built on three ingredients: (i) an order-aware, non-commutative binding operator for path composition, (ii) a calibrated similarity for robust hypervector-based retrieval, and (iii) a one-shot adjudication step that preserves interpretability while eliminating per-path LLM scoring. On WebQSP, CWQ, and the GrailQA split, PathHD (i) attains comparable or better Hits@1 than strong neural baselines while using one LLM call per query; (ii) reduces end-to-end latency by $40$-$60\%$ and GPU memory by $3$-$5\times$ thanks to encoder-free retrieval; and (iii) delivers faithful, path-grounded rationales that improve error diagnosis and controllability. These results indicate that carefully designed HDC representations provide a practical substrate for efficient KG-LLM reasoning, offering a favorable accuracy-efficiency-interpretability trade-off.
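To make the retrieval stage described in the abstract more concrete, the following is a minimal NumPy sketch of order-aware path encoding and blockwise cosine ranking with Top-K pruning. It is an illustration under stated assumptions, not the authors' implementation: each relation is assumed to be a stack of random unitary blocks, binding is assumed to be a blockwise matrix product, and the query is assumed to be given as a relation path. The function names, relation names, block count (256), and block size (4) are all hypothetical, and the calibrated similarity and the LLM adjudication step are not modeled.

```python
import numpy as np

def random_unitary_blocks(rng, num_blocks, block_size):
    """Return an array of shape (num_blocks, m, m) of random unitary matrices."""
    shape = (num_blocks, block_size, block_size)
    z = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    # The Q factor of a QR decomposition is unitary.
    return np.stack([np.linalg.qr(block)[0] for block in z])

def encode_path(relations, codebook):
    """Bind relation hypervectors in path order via blockwise matrix products."""
    hv = codebook[relations[0]]
    for rel in relations[1:]:
        hv = np.matmul(hv, codebook[rel])  # non-commutative, so order matters
    return hv

def blockwise_cosine(a, b):
    """Mean over blocks of Re<A_j, B_j>_F / (||A_j||_F * ||B_j||_F)."""
    inner = np.real(np.einsum("bij,bij->b", a, np.conj(b)))
    norms = np.linalg.norm(a, axis=(1, 2)) * np.linalg.norm(b, axis=(1, 2))
    return float(np.mean(inner / norms))

def top_k(query_hv, candidate_hvs, k):
    """Score every candidate and return the indices of the k best, plus all scores."""
    scores = np.array([blockwise_cosine(query_hv, c) for c in candidate_hvs])
    order = np.argsort(-scores)[:k]
    return order, scores

# Hypothetical relation vocabulary and candidate paths (assumptions for illustration).
rng = np.random.default_rng(0)
relation_names = ["born_in", "located_in", "works_for", "headquartered_in"]
codebook = {r: random_unitary_blocks(rng, 256, 4) for r in relation_names}

candidates = [
    ("born_in", "located_in"),
    ("located_in", "born_in"),        # same relations, reversed order
    ("works_for", "headquartered_in"),
]
candidate_hvs = [encode_path(p, codebook) for p in candidates]
query_hv = encode_path(("born_in", "located_in"), codebook)

order, scores = top_k(query_hv, candidate_hvs, k=2)
for idx in order:
    print(candidates[idx], round(float(scores[idx]), 3))
```

In this sketch the reversed path scores far below the matching path even though it uses the same two relations, which is the property a non-commutative binding operator is meant to provide. In a pipeline like the one the abstract describes, only the surviving Top-K paths would then be passed to the single LLM call.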

