LG · Dec 11, 2023

Spectral State Space Models

Naman Agarwal, Daniel Suo, Xinyi Chen, Elad Hazan

DeepMind, Princeton

arXiv:2312.06837v4 · 18.421 citations · h-index: 64

Originality: Highly original

AI Analysis

This paper addresses the problem of long-range prediction in sequence modeling, offering researchers and practitioners a novel method with theoretical guarantees.

The paper tackles sequence modeling with long-range dependencies by proposing spectral state space models. These models use fixed convolutional filters that require no learning, come with provable robustness guarantees, and are reported to outperform existing state space models in theory and in practice on synthetic dynamical systems and long-range prediction tasks.

This paper studies sequence modeling for prediction tasks with long range dependencies. We propose a new formulation for state space models (SSMs) based on learning linear dynamical systems with the spectral filtering algorithm (Hazan et al. (2017)). This gives rise to a novel sequence prediction architecture we call a spectral state space model. Spectral state space models have two primary advantages. First, they have provable robustness properties as their performance depends on neither the spectrum of the underlying dynamics nor the dimensionality of the problem. Second, these models are constructed with fixed convolutional filters that do not require learning while still outperforming SSMs in both theory and practice. The resulting models are evaluated on synthetic dynamical systems and long-range prediction tasks of various modalities. These evaluations support the theoretical benefits of spectral filtering for tasks requiring very long range memory.
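To make the "fixed convolutional filters" idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the filter construction from the spectral filtering algorithm of Hazan et al. (2017): the filters are the top eigenvectors of a fixed Hankel matrix with entries 2 / ((i + j)^3 - (i + j)), so they depend only on the sequence length and are never trained. The function names (`hankel_matrix`, `top_eigenpairs`, `spectral_features`), the scalar input, the power-iteration eigensolver, and the exact time indexing are illustrative choices; the autoregressive terms and the learned projection that sit on top of these features in a full model are omitted.

```python
import math
import random

def hankel_matrix(L):
    # Z[i][j] = 2 / ((i + j)^3 - (i + j)) for 1-indexed i, j (Hazan et al., 2017).
    return [[2.0 / ((i + j) ** 3 - (i + j)) for j in range(1, L + 1)]
            for i in range(1, L + 1)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_eigenpairs(Z, k, iters=300, seed=0):
    # Power iteration with deflation; adequate for a small symmetric PSD matrix.
    # (A library eigensolver would be the usual choice in practice.)
    rng = random.Random(seed)
    pairs = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in Z]
        for _ in range(iters):
            w = [dot(row, v) for row in Z]
            # Remove the components along eigenvectors already found.
            for lam, u in pairs:
                c = lam * dot(u, v)
                w = [wi - c * ui for wi, ui in zip(w, u)]
            norm = math.sqrt(dot(w, w))
            v = [wi / norm for wi in w]
        lam = dot(v, [dot(row, v) for row in Z])  # Rayleigh quotient
        pairs.append((lam, v))
    return pairs

def spectral_features(u, pairs):
    # feats[t][j] = sigma_j^(1/4) * sum_i phi_j[i] * u[t - i]  (causal convolution).
    L = len(pairs[0][1])
    feats = []
    for t in range(len(u)):
        row = []
        for sigma, phi in pairs:
            acc = sum(phi[i] * u[t - i] for i in range(min(t + 1, L)))
            row.append(sigma ** 0.25 * acc)
        feats.append(row)
    return feats

Z = hankel_matrix(16)                      # filters depend only on the length
pairs = top_eigenpairs(Z, k=3)             # fixed filters, no training involved
u = [math.sin(0.3 * t) for t in range(16)] # toy scalar input sequence
features = spectral_features(u, pairs)     # one 3-dimensional feature per step
```

In this sketch the only quantities a model would learn are the weights applied to `features`; the filters themselves are computed once from the sequence length and then held fixed.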
