LGNov 13, 2025

Transformers know more than they can tell -- Learning the Collatz sequence

arXiv:2511.10811v11 citationsh-index: 9
Originality Incremental advance
AI Analysis

This work addresses the challenge of understanding how language models learn complex arithmetic functions, offering insights into their internal algorithms, though it is incremental in applying known methods to a specific mathematical problem.

The paper investigates transformer models' ability to predict long Collatz sequence steps, finding accuracy up to 99.7% with specific bases like 24 and 32, but as low as 25% with others, and reveals that models learn to classify inputs based on residual modulo patterns, with errors primarily due to misestimating loop lengths rather than hallucinations.

We investigate transformer prediction of long Collatz steps, a complex arithmetic function that maps odd integers to their distant successors in the Collatz sequence ( $u_{n+1}=u_n/2$ if $u_n$ is even, $u_{n+1}=(3u_n+1)/2$ if $u_n$ is odd). Model accuracy varies with the base used to encode input and output. It can be as high as $99.7\%$ for bases $24$ and $32$, and as low as $37$ and $25\%$ for bases $11$ and $3$. Yet, all models, no matter the base, follow a common learning pattern. As training proceeds, they learn a sequence of classes of inputs that share the same residual modulo $2^p$. Models achieve near-perfect accuracy on these classes, and less than $1\%$ for all other inputs. This maps to a mathematical property of Collatz sequences: the length of the loops involved in the computation of a long Collatz step can be deduced from the binary representation of its input. The learning pattern reflects the model learning to predict inputs associated with increasing loop lengths. An analysis of failure cases reveals that almost all model errors follow predictable patterns. Hallucination, a common feature of large language models, almost never happens. In over $90\%$ of failures, the model performs the correct calculation, but wrongly estimates loop lengths. Our observations give a full account of the algorithms learned by the models. They suggest that the difficulty of learning such complex arithmetic function lies in figuring the control structure of the computation -- the length of the loops. We believe that the approach outlined here, using mathematical problems as tools for understanding, explaining, and perhaps improving language models, can be applied to a broad range of problems and bear fruitful results.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes