CL, LG | Jun 27, 2025

Identifying a Circuit for Verb Conjugation in GPT-2

arXiv:2506.22105v1 | h-index: 1
Originality: Incremental advance
AI Analysis

This paper addresses interpretability for language models, focusing on verb conjugation. It is an incremental advance because it builds on existing circuit analysis methods.

The study isolated a sub-network in GPT-2 Small responsible for subject-verb agreement, finding that a small fraction of components achieves near-model performance on simple tasks, while more complex settings require substantially more components.

I implement a procedure to isolate and interpret the sub-network (or "circuit") responsible for subject-verb agreement in GPT-2 Small. In this study, the model is given prompts where the subject is either singular (e.g. "Alice") or plural (e.g. "Alice and Bob"), and the task is to correctly predict the appropriate verb form ("walks" for singular subjects, "walk" for plural subjects). Using a series of techniques, including performance verification, automatic circuit discovery via direct path patching, and direct logit attribution, I isolate a candidate circuit that contributes significantly to the model's correct verb conjugation. The results suggest that only a small fraction of the network's component-token pairs is needed to achieve near-model performance on the base task, but that substantially more are needed in more complex settings.
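As a rough illustration of the direct logit attribution step named above, the sketch below projects each component's output onto the difference between the unembedding directions of the correct and incorrect verb forms. It is a toy under stated assumptions, not the paper's implementation: the component outputs and unembedding matrix are random stand-ins rather than GPT-2 Small activations, the token ids `WALKS` and `WALK` are hypothetical, and the final LayerNorm is ignored so that the contributions sum exactly to the total logit difference.

```python
import numpy as np


def direct_logit_attribution(component_outputs, W_U, correct_id, wrong_id):
    """Per-component contribution to logit(correct) - logit(wrong).

    component_outputs: (n_components, d_model) vectors each component writes
        to the residual stream at the final token position.
    W_U: (d_model, vocab) unembedding matrix.
    Assumption: the final LayerNorm is ignored, so attribution is linear.
    """
    # Direction in the residual stream that raises the correct verb's logit
    # relative to the incorrect verb's logit.
    direction = W_U[:, correct_id] - W_U[:, wrong_id]
    return component_outputs @ direction


# Toy stand-ins for real model tensors (hypothetical sizes and values).
rng = np.random.default_rng(0)
n_components, d_model, vocab = 6, 8, 10
component_outputs = rng.normal(size=(n_components, d_model))
W_U = rng.normal(size=(d_model, vocab))
WALKS, WALK = 3, 7  # hypothetical token ids for "walks" and "walk"

contributions = direct_logit_attribution(component_outputs, W_U, WALKS, WALK)

# Sanity check: the residual stream is the sum of component outputs, so the
# per-component contributions add up to the full logit difference.
residual = component_outputs.sum(axis=0)
logits = residual @ W_U
total_logit_diff = logits[WALKS] - logits[WALK]

# Components sorted from most to least supportive of the correct verb form.
ranking = np.argsort(-contributions)
```

In a real analysis, the rows of `component_outputs` would be the cached outputs of attention heads and MLP layers for a singular-subject prompt, and components with large positive contributions would be candidates for the circuit.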
