MMA: A Momentum Mamba Architecture for Human Activity Recognition with Inertial Sensors
This work addresses limitations in deep models for human activity recognition, offering a scalable paradigm with incremental improvements for applications in ubiquitous computing and mobile health.
The paper tackled the problem of human activity recognition from inertial sensors by introducing Momentum Mamba, a momentum-augmented structured state-space model that improves stability and long-sequence modeling, resulting in consistent gains in accuracy, robustness, and convergence speed over baselines on multiple benchmarks.
Human activity recognition (HAR) from inertial sensors is essential for ubiquitous computing, mobile health, and ambient intelligence. Conventional deep models such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformers have advanced HAR but remain limited by vanishing or exploding gradients, high computational cost, and difficulty in capturing long-range dependencies. Structured state-space models (SSMs) like Mamba address these challenges with linear complexity and effective temporal modeling, yet they are restricted to first-order dynamics without stable long-term memory mechanisms. We introduce Momentum Mamba, a momentum-augmented SSM that incorporates second-order dynamics to improve stability of information flow across time steps, robustness, and long-sequence modeling. Two extensions further expand its capacity: Complex Momentum Mamba for frequency-selective memory scaling. Experiments on multiple HAR benchmarks demonstrate consistent gains over vanilla Mamba and Transformer baselines in accuracy, robustness, and convergence speed. With only moderate increases in training cost, momentum-augmented SSMs offer a favorable accuracy-efficiency balance, establishing them as a scalable paradigm for HAR and a promising principal framework for broader sequence modeling applications.
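To make the phrase "second-order dynamics" concrete, the following is a minimal sketch of how a momentum term could be added to a diagonal linear state-space recurrence. The abstract does not state the update equations, so everything here is an assumption for illustration: the heavy-ball style update, the function name `momentum_ssm_scan`, and the parameters `beta` (momentum coefficient) and `alpha` (input step size) are hypothetical and may differ from the actual Momentum Mamba formulation, which also involves input-dependent (selective) parameters that this sketch omits.

```python
import numpy as np

def momentum_ssm_scan(x, a, b, c, beta=0.9, alpha=1.0):
    """Run a diagonal linear SSM with an assumed heavy-ball momentum term.

    x: (T,) scalar input sequence
    a, b, c: (N,) diagonal state transition, input, and output parameters
    beta: momentum coefficient (hypothetical name); beta = 0 gives the
          ordinary first-order recurrence h_t = a * h_{t-1} + alpha * b * x_t
    alpha: input step size (hypothetical name)
    """
    h = np.zeros_like(a, dtype=float)  # hidden state
    v = np.zeros_like(a, dtype=float)  # momentum (velocity) state
    y = np.empty(len(x))
    for t, x_t in enumerate(x):
        # Assumed momentum update: accumulate the input drive over time.
        v = beta * v + alpha * b * x_t
        # The state now depends on h_{t-1} and, through v, on h_{t-2},
        # which makes the recurrence second order in h.
        h = a * h + v
        y[t] = c @ h
    return y
```

Under these assumptions, setting `beta=0` recovers the first-order recurrence, while `beta>0` lets an input keep contributing to the state over later time steps through `v`, which is one way a longer memory could arise.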