CL LG Jan 13

Relational Knowledge Distillation Using Fine-tuned Function Vectors

arXiv:2601.08169v1
Originality: Incremental advance
AI Analysis

This work addresses the challenge of encoding and manipulating relational knowledge in AI systems, offering incremental improvements in interpretability and reasoning for large language models.

The researchers tackle the problem of representing relational knowledge in language models by fine-tuning function vectors on a small set of examples (about 20 word pairs). The fine-tuned vectors perform better on relation-based word-completion tasks than the original vectors, and a weighted combination of them improves analogy solving on cognitive science and SAT benchmarks.

Representing relations between concepts is a core prerequisite for intelligent systems to make sense of the world. Recent work using causal mediation analysis has shown that a small set of attention heads encodes task representation in in-context learning, captured in a compact representation known as the function vector. We show that fine-tuning function vectors with only a small set of examples (about 20 word pairs) yields better performance on relation-based word-completion tasks than using the original vectors derived from causal mediation analysis. These improvements hold for both small and large language models. Moreover, the fine-tuned function vectors yield improved decoding performance for relation words and show stronger alignment with human similarity judgments of semantic relations. Next, we introduce the composite function vector, a weighted combination of fine-tuned function vectors, to extract relational knowledge and support analogical reasoning. At inference time, inserting this composite vector into LLM activations markedly enhances performance on challenging analogy problems drawn from cognitive science and SAT benchmarks. Our results highlight the potential of activation patching as a controllable mechanism for encoding and manipulating relational knowledge, advancing both the interpretability and reasoning capabilities of large language models.
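
The composite function vector described in the abstract can be pictured with a toy sketch. The abstract states only that the composite is a weighted combination of fine-tuned function vectors and that it is inserted into LLM activations at inference time. Everything else here is an assumption made for illustration: the relation names, the weights, the 3-dimensional vectors, and the choice to insert by adding the vector to a single hidden state. This is not the authors' implementation.

```python
from typing import List, Sequence

Vector = List[float]


def composite_function_vector(vectors: Sequence[Vector],
                              weights: Sequence[float]) -> Vector:
    """Return the weighted sum of function vectors: sum_i weights[i] * vectors[i]."""
    if not vectors:
        raise ValueError("need at least one function vector")
    if len(vectors) != len(weights):
        raise ValueError("need exactly one weight per function vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all function vectors must have the same dimension")
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def patch_hidden_state(hidden: Vector, fv: Vector, scale: float = 1.0) -> Vector:
    """Add a scaled vector to one hidden state (assumed insertion method)."""
    if len(hidden) != len(fv):
        raise ValueError("hidden state and function vector must match in size")
    return [h + scale * f for h, f in zip(hidden, fv)]


# Hypothetical fine-tuned function vectors for two relations (toy values).
fv_antonym = [1.0, 0.0, 2.0]
fv_part_whole = [0.0, 4.0, 2.0]

# Hypothetical mixing weights; how the weights are obtained is not
# specified in the abstract.
composite = composite_function_vector([fv_antonym, fv_part_whole], [0.75, 0.25])

# Toy hidden state standing in for one token's activation at one layer.
patched = patch_hidden_state([0.5, 0.5, 0.5], composite)

print(composite)  # [0.75, 1.0, 2.0]
print(patched)    # [1.25, 1.5, 2.5]
```

In a real model the vectors would have the model's hidden dimension, and the patch would be applied inside the forward pass at a chosen layer and token position. Both of those choices are left open here.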

