LG | AI | Feb 23, 2024

Efficient State Space Model via Fast Tensor Convolution and Block Diagonalization

arXiv:2402.15290v4 | 1 citation | h-index: 1 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses efficiency bottlenecks in sequence modeling for AI applications, offering incremental improvements over existing state space models.

The paper tackles the trade-off between performance and computational efficiency in state space models for long sequences. It proposes an efficient SSM (eSSM) that combines diagonalization of the system matrix, an FFT-based fast tensor convolution, and block diagonalization. The reported results match S4 in performance, with a parameter count that is 13.24% of Mamba's and training that is 1.35 times faster than Mamba.

Existing models encounter bottlenecks in balancing performance and computational efficiency when modeling long sequences. Although the state space model (SSM) has achieved remarkable success in handling long sequence tasks, it still suffers from a large number of parameters. To further improve the efficiency of SSMs, we propose a new state space layer based on a multiple-input multiple-output (MIMO) SSM, called efficient SSM (eSSM). Our eSSM is built on the convolutional representation of the MIMO SSM. We propose several strategies to improve computational efficiency. First, diagonalization of the system matrix decouples the original system. Then, a fast tensor convolution is proposed based on the fast Fourier transform. In addition, block diagonalization of the SSM further reduces the model parameters and improves model flexibility. Extensive experimental results show that the performance of the proposed model on multiple datasets matches that of state-of-the-art models, such as S4, and is significantly better than Transformers and LSTM. In the model efficiency benchmark, the parameters of eSSM are only 12.89% of LSTM and 13.24% of Mamba. The training speed of eSSM is 3.94 times faster than LSTM and 1.35 times faster than Mamba. Code is available at: https://github.com/leonty1/essm.
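The three ideas in the abstract (a diagonal system matrix, FFT-based convolution, and block-diagonal structure) can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a discrete-time MIMO SSM of the form x_k = A x_{k-1} + B u_k, y_k = C x_k with a real diagonal A = diag(lam) and zero initial state. All function names, shapes, and the block-splitting scheme in `block_diag_param_count` are hypothetical and chosen only for illustration.

```python
import numpy as np

def ssm_kernel(lam, B, C, L):
    """Convolution kernel K[j] = C diag(lam)^j B for j = 0..L-1.

    Assumed shapes: lam (n,), B (n, m), C (p, n). Returns (L, p, m).
    With a diagonal A, matrix powers reduce to elementwise powers of lam.
    """
    powers = lam[None, :] ** np.arange(L)[:, None]      # (L, n)
    return np.einsum("pn,ln,nm->lpm", C, powers, B)     # (L, p, m)

def fft_tensor_conv(K, u):
    """Causal convolution y[k] = sum_j K[j] @ u[k-j], computed via FFT.

    K: (L, p, m), u: (L, m). Zero-padding to 2L avoids circular wrap-around.
    """
    L = u.shape[0]
    n_fft = 2 * L
    Kf = np.fft.rfft(K, n=n_fft, axis=0)                # (F, p, m)
    uf = np.fft.rfft(u, n=n_fft, axis=0)                # (F, m)
    yf = np.einsum("fpm,fm->fp", Kf, uf)                # per-frequency matrix-vector product
    return np.fft.irfft(yf, n=n_fft, axis=0)[:L]        # (L, p)

def recurrent_reference(lam, B, C, u):
    """Step-by-step recurrence, used only to check the FFT path."""
    x = np.zeros(lam.shape[0])
    ys = []
    for u_k in u:
        x = lam * x + B @ u_k
        ys.append(C @ x)
    return np.stack(ys)

def block_diag_param_count(n, m, p, blocks):
    """Parameters (lam, B, C) when the system is split into independent blocks.

    Hypothetical scheme: each block owns n/blocks states, m/blocks inputs,
    and p/blocks outputs, so B and C are block diagonal.
    """
    nb, mb, pb = n // blocks, m // blocks, p // blocks
    return blocks * (nb + nb * mb + pb * nb)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m, p, L = 8, 4, 4, 64
    lam = rng.uniform(0.1, 0.9, size=n)                 # stable real eigenvalues (assumption)
    B = rng.standard_normal((n, m))
    C = rng.standard_normal((p, n))
    u = rng.standard_normal((L, m))

    y_fft = fft_tensor_conv(ssm_kernel(lam, B, C, L), u)
    y_rec = recurrent_reference(lam, B, C, u)
    print("FFT path matches recurrence:", np.allclose(y_fft, y_rec))
    print("params, 1 block :", block_diag_param_count(n, m, p, 1))
    print("params, 2 blocks:", block_diag_param_count(n, m, p, 2))
```

The FFT path costs O(L log L) in the sequence length rather than the O(L^2) of a direct convolution, and under this assumed block scheme the B and C parameter counts shrink by the number of blocks.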

Code Implementations: 2 repos