CL · Apr 14

KoCo: Conditioning Language Model Pre-training on Knowledge Coordinates

arXiv:2604.12397 · 80.9 · h-index: 7

Predicted impact: top 67% in CL · last 90 days

Originality: Incremental advance

AI Analysis

For LLM practitioners, KoCo offers a simple plug-in to enhance pre-training efficiency and output reliability.

KoCo introduces a method to condition LLM pre-training on three-dimensional semantic coordinates, improving performance across 10 downstream tasks and accelerating convergence by ~30% while reducing hallucination.

Standard Large Language Model (LLM) pre-training typically treats corpora as flattened token sequences, often overlooking the real-world context that humans naturally rely on to interpret information. To bridge this gap, we introduce Knowledge Coordinate Conditioning (KoCo), a simple method that maps every document to a three-dimensional semantic coordinate. By prepending these coordinates as textual prefixes during pre-training, we aim to give the model explicit contextual awareness, so that it learns each document within the real-world knowledge structure. Experimental results demonstrate that KoCo significantly enhances performance across 10 downstream tasks and accelerates pre-training convergence by approximately 30%. Furthermore, our analysis indicates that explicitly modeling knowledge coordinates helps the model distinguish stable facts from noise, effectively mitigating hallucination in generated outputs.
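The prefix-conditioning step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the three coordinate axes or the prefix format, so the field names (`axis_1`, `axis_2`, `axis_3`), the bracketed prefix layout, and the helper `condition_document` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeCoordinate:
    """A three-dimensional coordinate attached to one document.

    The axis names are placeholders (assumption): the abstract only
    states that the coordinate has three dimensions.
    """
    axis_1: str
    axis_2: str
    axis_3: str

    def as_prefix(self) -> str:
        # Hypothetical textual format; the paper's actual prefix
        # template is not given in the abstract.
        return f"[coord: {self.axis_1} | {self.axis_2} | {self.axis_3}]\n"


def condition_document(text: str, coord: KnowledgeCoordinate) -> str:
    """Prepend the coordinate as plain text to the document body."""
    return coord.as_prefix() + text


if __name__ == "__main__":
    # Illustrative values only; they carry no meaning from the paper.
    coord = KnowledgeCoordinate("value-a", "value-b", "value-c")
    print(condition_document("Document body goes here.", coord))
```

Because the coordinate is ordinary text, a sketch like this needs no change to the tokenizer or model architecture: the conditioned string is tokenized and fed to the usual next-token objective like any other training document.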
