LG · AI · Feb 23

Decision MetaMamba: Enhancing Selective SSM in Offline RL with Heterogeneous Sequence Mixing

arXiv:2602.19805v1
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in offline RL for researchers and practitioners, offering an incremental improvement over existing Mamba models.

The paper tackles information loss in Mamba-based models for offline reinforcement learning, which arises from selective scanning. It proposes Decision MetaMamba (DMM), which replaces the token mixer with a dense layer-based sequence mixer to preserve local information. The authors report state-of-the-art performance across diverse RL tasks with a compact parameter footprint.

Mamba-based models have drawn much attention in offline RL. However, their selective mechanism is often detrimental when key steps in RL sequences are omitted. To address this issue, we propose a simple yet effective structure, called Decision MetaMamba (DMM), which replaces Mamba's token mixer with a dense layer-based sequence mixer and modifies the positional structure to preserve local information. By performing sequence mixing that considers all channels simultaneously before Mamba, DMM prevents information loss due to selective scanning and residual gating. Extensive experiments demonstrate that DMM delivers state-of-the-art performance across diverse RL tasks. Furthermore, DMM achieves these results with a compact parameter footprint, demonstrating strong potential for real-world applications.
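To make the idea of a dense layer-based sequence mixer concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the causal left-padding, the fixed local `window`, and the random initialization are all assumptions chosen for illustration. The only property taken from the abstract is that one dense layer mixes all channels of the nearby steps at once, before any selective scan would run.

```python
import random


def init_dense(d, window, seed=0):
    """Hypothetical initializer: one dense layer mapping window*d inputs to d outputs."""
    rng = random.Random(seed)
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(window * d)] for _ in range(d)]
    bias = [0.0] * d
    return weight, bias


def dense_sequence_mixer(tokens, weight, bias, window):
    """Mix each step with its preceding steps using a single dense layer.

    tokens: list of T steps, each a list of d channel values.
    Causality (no access to future steps) is an assumption of this sketch.
    """
    d = len(tokens[0])
    # Left-pad with zeros so step t only sees steps t-window+1 .. t.
    padded = [[0.0] * d for _ in range(window - 1)] + tokens
    out = []
    for t in range(len(tokens)):
        # Flatten the local window so every channel of every local step
        # enters the dense layer jointly, rather than channel by channel.
        flat = [x for step in padded[t:t + window] for x in step]
        out.append([sum(w * x for w, x in zip(row, flat)) + b
                    for row, b in zip(weight, bias)])
    return out


if __name__ == "__main__":
    # Toy sequence: 4 steps, 2 channels (e.g. embedded return, state, action tokens).
    seq = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    w, b = init_dense(d=2, window=3)
    mixed = dense_sequence_mixer(seq, w, b, window=3)
    print(len(mixed), len(mixed[0]))
```

In a full model the mixed sequence would then be passed to the Mamba block; that stage is omitted here because the abstract does not specify its details.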

