SEAICLFeb 20, 2025

Mechanistic Understanding of Language Models in Syntactic Code Completion

arXiv:2502.18499v12 citationsh-index: 5
Originality Incremental advance
AI Analysis

This work addresses the limited understanding of internal decision-making in Code LMs, which is important for developers and researchers to mitigate unintended harms as these models are increasingly deployed in real-life applications, though it is incremental in focusing on a specific syntactic task.

The study investigated how Code LMs, specifically CodeLlama-7b, make decisions in syntactic code completion tasks like closing parentheses, finding that middle-later layers are crucial for confident predictions and that multi-head attention plays a key role, with some attention heads tracking closed parentheses but not always promoting correct missing ones.

Recently, language models (LMs) have shown impressive proficiency in code generation tasks, especially when fine-tuned on code-specific datasets, commonly known as Code LMs. However, our understanding of the internal decision-making processes of Code LMs, such as how they use their (syntactic or semantic) knowledge, remains limited, which could lead to unintended harm as they are increasingly used in real life. This motivates us to conduct one of the first Mechanistic Interpretability works to understand how Code LMs perform a syntactic completion task, specifically the closing parenthesis task, on the CodeLlama-7b model (Roziere et al. 2023). Our findings reveal that the model requires middle-later layers until it can confidently predict the correct label for the closing parenthesis task. Additionally, we identify that while both multi-head attention (MHA) and feed-forward (FF) sub-layers play essential roles, MHA is particularly crucial. Furthermore, we also discover attention heads that keep track of the number of already closed parentheses precisely but may or may not promote a correct number of closing parentheses that are still missing, leading to a positive or negative impact on the model's performance.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes