LG · AI · CR · Jun 30, 2025

Learning Modular Exponentiation with Transformers

arXiv:2506.23679v2 · 0.00 · h-index: 1
AI Analysis: 50

This work studies how neural networks learn modular arithmetic, an operation central to cryptography. Its contribution to mechanistic interpretability is incremental.

The researchers trained a Transformer model to perform modular exponentiation and found that reciprocal operand training led to strong performance gains with sudden generalization across moduli, reflecting grokking-like dynamics and specialized computational circuits.

Modular exponentiation is crucial to number theory and cryptography, yet remains largely unexplored from a mechanistic interpretability standpoint. We train a 4-layer encoder-decoder Transformer model to perform this operation and investigate the emergence of numerical reasoning during training. Utilizing principled sampling strategies, PCA-based embedding analysis, and activation patching, we examine how number-theoretic properties are encoded within the model. We find that reciprocal operand training leads to strong performance gains, with sudden generalization across related moduli. These synchronized accuracy surges reflect grokking-like dynamics, suggesting the model internalizes shared arithmetic structure. We also find a subgraph consisting entirely of attention heads in the final layer sufficient to achieve full performance on the task of regular exponentiation. These results suggest that transformer models learn modular arithmetic through specialized computational circuits, paving the way for more interpretable and efficient neural approaches to modular exponentiation.
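For readers unfamiliar with the target operation, the following is a minimal sketch of modular exponentiation, a^b mod m, using the standard square-and-multiply method. It illustrates only the function the model is trained to approximate, not the authors' code. The names `mod_exp` and `sample_examples` are hypothetical, and the uniform sampler is an assumed stand-in, not the sampling strategy used in the paper.

```python
import random


def mod_exp(base: int, exponent: int, modulus: int) -> int:
    """Compute (base ** exponent) % modulus by square-and-multiply."""
    if modulus <= 0:
        raise ValueError("modulus must be positive")
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1 % modulus  # equals 0 when modulus == 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:  # current low bit set: multiply it in
            result = (result * base) % modulus
        base = (base * base) % modulus  # square for the next bit
        exponent >>= 1
    return result


def sample_examples(n: int, max_value: int, seed: int = 0):
    """Draw n (a, b, m, a^b mod m) tuples uniformly at random.

    Hypothetical helper: uniform sampling is an assumption made for
    illustration, not the paper's sampling scheme. Requires max_value > 2.
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        a = rng.randrange(0, max_value)
        b = rng.randrange(0, max_value)
        m = rng.randrange(2, max_value)
        examples.append((a, b, m, mod_exp(a, b, m)))
    return examples
```

Python's built-in three-argument `pow(a, b, m)` computes the same value and can serve as a reference when checking generated examples.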
