On Relation-Specific Neurons in Large Language Models
This work addresses the interpretability of knowledge storage in LLMs for researchers, providing incremental insights into neuron specialization.
The study investigates whether large language models contain neurons that specifically detect relations independent of entities, using statistical methods on the LLama-2 family, and demonstrates their existence with properties like cumulativity, versatility, and interference, showing that deactivating relation-specific neurons can improve recall for other relations.
In large language models (LLMs), certain \emph{neurons} can store distinct pieces of knowledge learned during pretraining. While factual knowledge typically appears as a combination of \emph{relations} and \emph{entities}, it remains unclear whether some neurons focus on a relation itself -- independent of any entity. We hypothesize such neurons \emph{detect} a relation in the input text and \emph{guide} generation involving such a relation. To investigate this, we study the LLama-2 family on a chosen set of relations, with a \textit{statistics}-based method. Our experiments demonstrate the existence of relation-specific neurons. We measure the effect of selectively deactivating candidate neurons specific to relation $r$ on the LLM's ability to handle (1) facts involving relation $r$ and (2) facts involving a different relation $r' \neq r$. With respect to their capacity for encoding relation information, we give evidence for the following three properties of relation-specific neurons. \textbf{(i) Neuron cumulativity.} Multiple neurons jointly contribute to processing facts involving relation $r$, with no single neuron fully encoding a fact in $r$ on its own. \textbf{(ii) Neuron versatility.} Neurons can be shared across multiple closely related as well as less related relations. In addition, some relation neurons transfer across languages. \textbf{(iii) Neuron interference.} Deactivating neurons specific to one relation can improve LLMs' factual recall performance for facts of other relations. We make our code and data publicly available at https://github.com/cisnlp/relation-specific-neurons.