LOCLFLLGApr 5, 2024

Counting Like Transformers: Compiling Temporal Counting Logic Into Softmax Transformers

arXiv:2404.04393v238 citationsh-index: 5
Originality Highly original
AI Analysis

This work provides foundational insights into transformer expressivity, benefiting researchers in theoretical machine learning and formal methods.

The paper tackled the problem of understanding the computational power of transformers by introducing temporal counting logic K_t[#] and its RASP variant C-RASP, showing they are equivalent and can be compiled into future-masked soft attention transformers, establishing the best-known lower bound on expressivity for such transformers with unbounded input size.

Deriving formal bounds on the expressivity of transformers, as well as studying transformers that are constructed to implement known algorithms, are both effective methods for better understanding the computational power of transformers. Towards both ends, we introduce the temporal counting logic $\textsf{K}_\text{t}$[#] alongside the RASP variant $\textsf{C-RASP}$. We show they are equivalent to each other, and that together they are the best-known lower bound on the formal expressivity of future-masked soft attention transformers with unbounded input size. We prove this by showing all $\textsf{K}_\text{t}$[#] formulas can be compiled into these transformers.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes