LG · AI · CL · Oct 26, 2023

How do Language Models Bind Entities in Context?

arXiv:2310.17191v2 · 88 citations · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses in-context reasoning in language models and offers interpretable insights into how they represent symbolic knowledge. It is rated incremental because it builds on existing understanding of LM mechanisms.

The paper studies how language models bind entities to their attributes in context, such as linking shapes to colors. It identifies a general binding ID mechanism in sufficiently large models from the Pythia and LLaMA families, and shows that internal activations represent binding information with binding ID vectors that lie in a continuous subspace.

To correctly use in-context information, language models (LMs) must bind entities to their attributes. For example, given a context describing a "green square" and a "blue circle", LMs must bind the shapes to their respective colors. We analyze LM representations and identify the binding ID mechanism: a general mechanism for solving the binding problem, which we observe in every sufficiently large model from the Pythia and LLaMA families. Using causal interventions, we show that LMs' internal activations represent binding information by attaching binding ID vectors to corresponding entities and attributes. We further show that binding ID vectors form a continuous subspace, in which distances between binding ID vectors reflect their discernability. Overall, our results uncover interpretable strategies in LMs for representing symbolic knowledge in-context, providing a step towards understanding general in-context reasoning in large-scale LMs.
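
The binding ID idea described above can be illustrated with a toy sketch. This is not the paper's code and does not touch real LM activations: the content vectors, the orthogonal `binding_ids`, the additive `encode` step, and the dot-product `lookup` rule are all hypothetical choices made for illustration. It only shows why swapping binding vectors on the attributes, in the spirit of a causal intervention, would swap which attribute an entity retrieves.

```python
import random

DIM = 8
rng = random.Random(0)

def rand_vec():
    return [rng.gauss(0.0, 1.0) for _ in range(DIM)]

def add(u, v, scale=1.0):
    """Return u + scale * v, element-wise."""
    return [a + scale * b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical content vectors (assumption: stand-ins for what an LM encodes).
entities = {"square": rand_vec(), "circle": rand_vec()}
attributes = {"green": rand_vec(), "blue": rand_vec()}

# Assumption: one binding ID vector per pair, chosen orthogonal for clarity.
binding_ids = [[1.0 if i == k else 0.0 for i in range(DIM)] for k in range(2)]

def encode(context):
    """Attach the k-th binding ID vector to the k-th entity and attribute."""
    ent_acts, attr_acts = {}, {}
    for k, (ent, attr) in enumerate(context):
        ent_acts[ent] = add(entities[ent], binding_ids[k])
        attr_acts[attr] = add(attributes[attr], binding_ids[k])
    return ent_acts, attr_acts

def lookup(entity, ent_acts, attr_acts):
    """Pick the attribute whose binding component best matches the entity's."""
    query = add(ent_acts[entity], entities[entity], -1.0)
    scores = {
        name: dot(query, add(act, attributes[name], -1.0))
        for name, act in attr_acts.items()
    }
    return max(scores, key=scores.get)

context = [("square", "green"), ("circle", "blue")]
ent_acts, attr_acts = encode(context)
baseline = {e: lookup(e, ent_acts, attr_acts) for e in ent_acts}

# Toy intervention: exchange the binding vectors carried by the attributes.
delta = add(binding_ids[1], binding_ids[0], -1.0)
swapped = dict(attr_acts)
swapped["green"] = add(attr_acts["green"], delta)
swapped["blue"] = add(attr_acts["blue"], delta, -1.0)
intervened = {e: lookup(e, ent_acts, swapped) for e in ent_acts}

print(baseline)
print(intervened)
```

In this toy setup the content vectors never change; only the binding component moves, which is the property the intervention is meant to isolate.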

Code Implementations: 1 repo
