The Expressive Limits of Diagonal SSMs for State-Tracking
This work addresses a theoretical gap in machine learning for researchers studying SSMs, providing foundational insights into their limitations, though it is incremental in advancing theoretical understanding rather than empirical performance.
The paper tackled the problem of understanding the expressive power of diagonal State-Space Models (SSMs) on sequential state-tracking tasks, showing that single-layer models cannot express non-Abelian group tracking at finite precision and identifying precise expressivity limits for multi-layer models within solvable groups.
State-Space Models (SSMs) have recently been shown to achieve strong empirical performance on a variety of long-range sequence modeling tasks while remaining efficient and highly-parallelizable. However, the theoretical understanding of their expressive power remains limited. In this work, we study the expressivity of input-Dependent Complex-valued Diagonal (DCD) SSMs on sequential state-tracking tasks. We show that single-layer DCD SSMs cannot express state-tracking of any non-Abelian group at finite precision. More generally, we show that $k$-layer DCD SSMs can express state-tracking of a group if and only if that group has a subnormal series of length $k$, with Abelian factors. That is, we identify the precise expressivity range of $k$-layer DCD SSMs within the solvable groups. Empirically, we find that multi-layer models often fail to learn state-tracking for non-Abelian groups, highlighting a gap between expressivity and learnability.