CL · AI · Jan 14

Understanding or Memorizing? A Case Study of German Definite Articles in Language Models

arXiv:2601.09313v1 · 1 citation · h-index: 2
Originality: Incremental advance
AI Analysis

This paper addresses the interpretability of language models and is relevant to linguists and AI researchers, but it is an incremental advance because it builds on existing gradient-based methods.

The study investigates whether language models generalize grammatical rules or memorize patterns, using German definite articles as a test case. It finds substantial overlap in the most affected neurons across different gender-case settings, which suggests that models rely at least partly on memorized associations rather than abstract rules.

Language models perform well on grammatical agreement, but it is unclear whether this reflects rule-based generalization or memorization. We study this question for German definite singular articles, whose forms depend on gender and case. Using GRADIEND, a gradient-based interpretability method, we learn parameter update directions for gender-case specific article transitions. We find that updates learned for a specific gender-case article transition frequently affect unrelated gender-case settings, with substantial overlap among the most affected neurons across settings. These results argue against a strictly rule-based encoding of German definite articles, indicating that models at least partly rely on memorized associations rather than abstract grammatical rules.
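
To make the "overlap among the most affected neurons" idea concrete, here is a minimal sketch of one way such an overlap could be measured. It is not the GRADIEND method and not the paper's metric: the setting names, the toy update values, the choice of k, and the use of Jaccard similarity are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def top_k_neurons(update: List[float], k: int) -> Set[int]:
    """Return the indices of the k neurons with the largest absolute update."""
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    return set(ranked[:k])


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity of two index sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical per-neuron update magnitudes for two gender-case transitions.
# The values are made up; real update directions would come from the model.
updates: Dict[str, List[float]] = {
    "masc_nom_to_masc_acc": [0.9, -0.1, 0.8, 0.0, 0.05, -0.7],
    "fem_dat_to_fem_nom": [0.85, 0.02, -0.75, 0.1, 0.6, 0.0],
}

k = 3  # assumed cutoff for "most affected"
top_a = top_k_neurons(updates["masc_nom_to_masc_acc"], k)
top_b = top_k_neurons(updates["fem_dat_to_fem_nom"], k)
overlap = jaccard(top_a, top_b)
print(sorted(top_a), sorted(top_b), overlap)
```

Under a strictly rule-based encoding one might expect unrelated gender-case settings to share few top neurons; a high overlap score in a comparison like this would point the other way, which is the pattern the abstract reports.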
