CL · LG · Feb 27, 2025

Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking

arXiv:2502.20129v3 · 11 citations · h-index: 6 · Has Code · ACL
Originality: Incremental advance
AI Analysis

The paper gives researchers mechanistic insight into why CoT is effective, though it is an incremental advance that builds on prior theoretical work.

The study investigates how Transformers with chain-of-thought (CoT) learn state tracking algorithms. It finds that they implicitly embed finite state automata (FSAs), with the neuron sets for each state achieving nearly 100% accuracy, and that the learned algorithms remain resilient in challenging settings such as skipped intermediate steps, data noise, and length generalization.

Chain-of-thought (CoT) significantly enhances the performance of large language models (LLMs) across a wide range of tasks, and prior research shows that CoT can theoretically increase expressiveness. However, there is limited mechanistic understanding of the algorithms that Transformer+CoT can learn. Our key contributions are: (1) We evaluate the state tracking capabilities of Transformer+CoT and its variants, confirming the effectiveness of CoT. (2) Next, we identify the circuit (a subset of model components responsible for tracking the world state), indicating that late-layer MLP neurons play a key role. We propose two metrics, compression and distinction, and show that the neuron sets for each state achieve nearly 100% accuracy, providing evidence of an implicit finite state automaton (FSA) embedded within the model. (3) Additionally, we explore three challenging settings: skipping intermediate steps, introducing data noise, and testing length generalization. Our results demonstrate that Transformer+CoT learns robust algorithms (FSAs), highlighting its resilience in challenging scenarios. Our code is available at https://github.com/IvanChangPKU/FSA.
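
To make "state tracking" concrete, here is a minimal sketch (not the authors' code) of a finite state automaton whose per-step states play the role of a chain of thought. The cyclic-group task, the function name `run_fsa`, and its parameters are illustrative assumptions; the abstract does not specify which tasks the paper uses.

```python
from typing import List


def run_fsa(inputs: List[int], modulus: int = 2, start: int = 0) -> List[int]:
    """Return the automaton state after each input symbol.

    Hypothetical toy task: states are integers 0..modulus-1 and the
    transition function is q' = (q + a) mod modulus. The returned list
    is the sequence of intermediate states, analogous to a CoT trace.
    """
    states = []
    state = start
    for symbol in inputs:
        # Apply the transition function to the current state and input.
        state = (state + symbol) % modulus
        states.append(state)
    return states


# Parity of a bit string: with CoT the model would emit every
# intermediate state; without CoT it would emit only the last one.
trace = run_fsa([1, 0, 1, 1])
print(trace)      # [1, 1, 0, 1]
print(trace[-1])  # 1
```

In this picture, a model that has learned the task with CoT only needs to implement the single-step transition at each position, which is the kind of behaviour the abstract describes as an implicit FSA.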

