CV | Feb 8, 2024

Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data

arXiv:2402.05892v5 | 124 citations | h-index: 7 | ECCV
Originality: Incremental advance
AI Analysis

This work addresses computational bottlenecks for researchers and practitioners working with multi-dimensional data like images, video, and weather data, though it is an incremental extension of an existing method.

The authors address the quadratic compute and memory cost of self-attention in Transformers on multi-dimensional data by extending the Mamba architecture to arbitrary dimensions. The resulting model, Mamba-ND, performs competitively with state-of-the-art methods on ImageNet-1K classification, HMDB-51 action recognition, and ERA5 weather forecasting.

In recent years, Transformers have become the de facto architecture for sequence modeling on text and a variety of multi-dimensional data, such as images and video. However, the use of self-attention layers in a Transformer incurs prohibitive compute and memory complexity that scales quadratically w.r.t. the sequence length. A recent architecture, Mamba, based on state space models, has been shown to achieve comparable performance for modeling text sequences, while scaling linearly with the sequence length. In this work, we present Mamba-ND, a generalized design extending the Mamba architecture to arbitrary multi-dimensional data. Our design alternately unravels the input data across different dimensions following row-major orderings. We provide a systematic comparison of Mamba-ND with several other alternatives, based on prior multi-dimensional extensions such as Bi-directional LSTMs and S4ND. Empirically, we show that Mamba-ND demonstrates performance competitive with the state-of-the-art on a variety of multi-dimensional benchmarks, including ImageNet-1K classification, HMDB-51 action recognition, and ERA5 weather forecasting.
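The idea of unraveling the input along different dimensions in row-major order can be illustrated with a small sketch. This is not the authors' implementation: the function names `scan_order` and `layer_orders`, and the particular schedule (rotate the axes every two layers, reverse the direction on odd layers), are assumptions made only for illustration. The code computes, for each layer, the order in which the positions of an N-dimensional grid would be fed to a 1D sequence model.

```python
from itertools import product


def scan_order(shape, axis_order):
    """Flat (row-major) indices of `shape`, visited so that the last
    axis in `axis_order` varies fastest."""
    # Row-major strides of the original memory layout.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]

    order = []
    for idx in product(*(range(shape[a]) for a in axis_order)):
        # Map the permuted multi-index back to a flat index.
        order.append(sum(i * strides[a] for i, a in zip(idx, axis_order)))
    return order


def layer_orders(shape, num_layers):
    """Hypothetical schedule: one scan order per layer, cycling through
    axis rotations and alternating forward/backward direction."""
    n = len(shape)
    orders = []
    for layer in range(num_layers):
        k = (layer // 2) % n
        axis_order = tuple(range(k, n)) + tuple(range(k))  # rotate axes
        order = scan_order(shape, axis_order)
        if layer % 2 == 1:
            order.reverse()  # backward scan on odd layers
        orders.append(order)
    return orders


if __name__ == "__main__":
    # A 2 x 3 grid whose cells are numbered 0..5 in row-major order.
    for layer, order in enumerate(layer_orders((2, 3), 4)):
        print(layer, order)
```

For a 2 x 3 grid, the four layers visit the cells row by row, row by row in reverse, column by column, and column by column in reverse. Each layer still processes a plain 1D sequence, so the per-layer cost stays linear in the number of positions; only the ordering changes between layers.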

Code Implementations: 1 repo
