CL · AI · May 23, 2024

Knowledge Localization: Mission Not Accomplished? Enter Query Localization!

arXiv:2405.14117v2 · 12 citations · h-index: 28 · Has Code · ICLR
Originality: Incremental advance
AI Analysis

This work addresses a foundational problem for researchers studying LLM mechanisms, but the advance is incremental because it builds on existing theories.

The paper challenges the Knowledge Localization assumption in LLMs by identifying its limitations in knowledge storage and expression, and proposes the Query Localization assumption, which improves knowledge modification performance as validated through 39 experimental sets.

Large language models (LLMs) store extensive factual knowledge, but the mechanisms by which they store and express this knowledge remain unclear. The Knowledge Neuron (KN) thesis is a prominent theory for explaining these mechanisms. This theory is based on the Knowledge Localization (KL) assumption, which suggests that a fact can be localized to a few knowledge storage units, namely knowledge neurons. However, this assumption has two limitations: first, it may be too rigid regarding knowledge storage, and second, it neglects the role of the attention module in knowledge expression. In this paper, we first re-examine the KL assumption and demonstrate that these limitations do indeed exist. To address them, we then present two new findings, each targeting one of the limitations: one focusing on knowledge storage and the other on knowledge expression. We summarize these findings as the Query Localization (QL) assumption and argue that the KL assumption can be viewed as a simplification of the QL assumption. Based on the QL assumption, we further propose the Consistency-Aware KN modification method, which improves the performance of knowledge modification, further validating our new assumption. We conduct 39 sets of experiments, along with additional visualization experiments, to rigorously confirm our conclusions. Code is available at https://github.com/heng840/KnowledgeLocalization.
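
As a rough illustration of the difference between the two assumptions, the sketch below compares the knowledge-neuron sets found for several paraphrased queries of the same fact. Under the KL assumption these sets should largely coincide; a low overlap would suggest that localization depends on the query. This is not the paper's method or metric: the Jaccard-based score, the function names, and the (layer, index) neuron sets are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two neuron sets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_overlap(neuron_sets):
    """Average Jaccard similarity over all pairs of per-query neuron sets."""
    pairs = list(combinations(neuron_sets, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical (layer, neuron index) sets, one per paraphrased query.
# These values are made up; real sets would come from an attribution method.
consistent_fact = [
    {(9, 12), (10, 401), (11, 77)},
    {(9, 12), (10, 401), (11, 77)},
    {(9, 12), (10, 401), (11, 80)},
]
inconsistent_fact = [
    {(9, 12), (10, 401)},
    {(3, 5), (7, 900)},
    {(9, 12), (6, 44)},
]

if __name__ == "__main__":
    print(f"consistent fact:   {mean_pairwise_overlap(consistent_fact):.3f}")
    print(f"inconsistent fact: {mean_pairwise_overlap(inconsistent_fact):.3f}")
```

A fact whose paraphrases share most neurons scores close to 1, while a fact whose paraphrases activate mostly different neurons scores close to 0. Any threshold separating the two cases would be a further assumption.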

