CC AI CL LG Dec 9, 2024

The Computational Limits of State-Space Models and Mamba via the Lens of Circuit Complexity

Yifang Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song

arXiv:2412.06148v217.028 citations h-index: 21 CPAL

Originality: Incremental advance

AI Analysis

This work challenges the assumption that Mamba is more computationally expressive than Transformers, providing theoretical insights for researchers in machine learning and computational complexity.

The paper uses circuit complexity to analyze the computational limits of Mamba and State-space Models (SSMs). It shows that, with poly(n)-precision and constant-depth layers, both reside within the DLOGTIME-uniform TC^0 class. They therefore have the same theoretical computational capabilities as Transformers and, if TC^0 ≠ NC^1, cannot solve problems such as arithmetic formula problems.
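For readers unfamiliar with the model class, the following is a minimal sketch of the linear recurrence behind a state-space layer, assuming the textbook discretized form h_t = a·h_{t-1} + b·x_t, y_t = c·h_t with a scalar state. The function name `ssm_scan` and the parameters `a`, `b`, `c` are hypothetical and chosen for illustration; this is not the paper's formulation, and a selective SSM such as Mamba would additionally make the parameters depend on the input.

```python
def ssm_scan(xs, a, b, c, h0=0.0):
    """Run a scalar linear state-space recurrence over the inputs xs.

    Assumed (illustrative) update rule:
        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t
    Returns the list of outputs y_1, ..., y_n.
    """
    h = h0
    ys = []
    for x in xs:
        h = a * h + b * x   # update the hidden state
        ys.append(c * h)    # read the output off the state
    return ys


if __name__ == "__main__":
    print(ssm_scan([1.0, 1.0, 1.0], a=0.5, b=1.0, c=2.0))
```

The loop carries a state from one position to the next, which is the "stateful design" the result is about: the claim above is that, under the stated precision and depth assumptions, this kind of computation still fits within constant-depth threshold circuits.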

In this paper, we analyze the computational limitations of Mamba and State-space Models (SSMs) using the circuit complexity framework. Despite Mamba's stateful design and the recent attention it has received as a strong candidate to outperform Transformers, we demonstrate that both Mamba and SSMs with $\mathrm{poly}(n)$-precision and constant-depth layers reside within the $\mathsf{DLOGTIME}$-uniform $\mathsf{TC}^0$ complexity class. This result indicates that Mamba theoretically has the same computational capabilities as Transformers, and that it cannot solve problems such as arithmetic formula problems, Boolean formula value problems, and permutation composition problems if $\mathsf{TC}^0 \neq \mathsf{NC}^1$. It therefore challenges the assumption that Mamba is more computationally expressive than Transformers. Our contributions include rigorous proofs showing that Selective SSM and Mamba architectures can be simulated by $\mathsf{DLOGTIME}$-uniform $\mathsf{TC}^0$ circuits, and that they cannot solve problems outside $\mathsf{TC}^0$.
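To make one of the named hard problems concrete, here is a minimal sketch of permutation composition: given a sequence of permutations, compute their overall composition. The representation (a tuple `p` where `p[i]` is the image of `i`) and the helper names `compose` and `compose_all` are assumptions made for illustration, not taken from the paper.

```python
from functools import reduce


def compose(p, q):
    """Return the permutation that applies p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def compose_all(perms, n):
    """Compose a sequence of permutations of {0, ..., n-1}, left to right."""
    identity = tuple(range(n))
    return reduce(compose, perms, identity)


if __name__ == "__main__":
    swap = (1, 0, 2)    # exchanges 0 and 1
    cycle = (1, 2, 0)   # sends 0 -> 1 -> 2 -> 0
    print(compose_all([swap, cycle], 3))
    print(compose_all([cycle, swap], 3))
```

A sequential loop like this solves the problem easily by tracking a running state across all inputs. The abstract's claim concerns a different resource: whether a constant-depth threshold circuit, and hence a constant-depth Mamba or SSM under the stated precision assumption, can do the same for arbitrarily long input sequences.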

