LG · AI · Mar 30

On the Mirage of Long-Range Dependency, with an Application to Integer Multiplication

arXiv:2603.29069 · 53.6 · h-index: 3
Predicted impact: top 45% in LG · last 90 days · Originality: Highly original
AI Analysis

This work challenges the common assumption that long-range dependency is inherent to certain tasks, offering a new perspective for designing neural architectures and representations.

The authors argue that long-range dependency in integer multiplication is a mirage caused by the choice of representation, not an intrinsic property. By using a 2D outer-product grid representation, a neural cellular automaton with 321 parameters achieves perfect length generalization up to 683× the training range, while Transformers and Mamba fail.

Integer multiplication has long been considered a hard problem for neural networks, with the difficulty widely attributed to the O(n) long-range dependency induced by carry chains. We argue that this diagnosis is wrong: long-range dependency is not an intrinsic property of multiplication, but a mirage produced by the choice of computational spacetime. We formalize the notion of mirage and provide a constructive proof: when two n-bit binary integers are laid out as a 2D outer-product grid, every step of long multiplication collapses into a $3 \times 3$ local neighborhood operation. Under this representation, a neural cellular automaton with only 321 learnable parameters achieves perfect length generalization up to $683\times$ the training range. Five alternative architectures -- including Transformer (6,625 params), Transformer+RoPE, and Mamba -- all fail under the same representation. We further analyze how partial successes locked the community into an incorrect diagnosis, and argue that any task diagnosed as requiring long-range dependency should first be examined for whether the dependency is intrinsic to the task or induced by the computational spacetime.
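To make the locality claim concrete, here is a minimal sketch of multiplication on an outer-product grid. It is not the paper's neural cellular automaton or its proof construction; it is the classic ripple-carry array multiplier, used here as an assumed stand-in, where cell (i, j) holds the partial product a_j AND b_i and reads only a sum bit from the row above and a carry bit from its left neighbour, all inside its 3×3 neighbourhood. The function names (`to_bits`, `outer_product_grid`, `multiply_on_grid`) are hypothetical.

```python
def to_bits(x, n):
    """Little-endian list of the n low bits of x."""
    return [(x >> k) & 1 for k in range(n)]


def outer_product_grid(a, b, n):
    """grid[i][j] = a_j AND b_i, the partial product of weight 2**(i + j)."""
    a_bits, b_bits = to_bits(a, n), to_bits(b, n)
    return [[a_bits[j] & b_bits[i] for j in range(n)] for i in range(n)]


def multiply_on_grid(a, b, n):
    """Multiply two n-bit integers using only neighbour-to-neighbour reads.

    Assumed stand-in for the paper's construction: a ripple-carry array
    multiplier. Cell (i, j) reads
      - its own partial product grid[i][j],
      - a sum bit from (i-1, j+1), or the row carry-out from (i-1, j)
        when j is the last column,
      - a carry bit from (i, j-1).
    """
    grid = outer_product_grid(a, b, n)
    s = [[0] * n for _ in range(n)]  # sum bit stored in each cell
    c = [[0] * n for _ in range(n)]  # carry bit stored in each cell
    for i in range(n):
        for j in range(n):
            if i == 0:
                sum_in = 0
            elif j + 1 < n:
                sum_in = s[i - 1][j + 1]   # diagonal neighbour above
            else:
                sum_in = c[i - 1][j]       # carry-out of the row above
            carry_in = c[i][j - 1] if j > 0 else 0
            total = grid[i][j] + sum_in + carry_in
            s[i][j] = total & 1
            c[i][j] = total >> 1
    # Low n bits sit in column 0; the high bits sit in the last row.
    bits = [s[i][0] for i in range(n)] + s[n - 1][1:] + [c[n - 1][n - 1]]
    return sum(bit << k for k, bit in enumerate(bits))
```

Calling `multiply_on_grid(a, b, n)` with any `a, b < 2**n` should agree with `a * b`. The loop order here is sequential for readability; the point of the sketch is only that no cell ever reads beyond its immediate neighbours, which is the property the abstract attributes to the 2D layout.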
