AMMSM: Adaptive Motion Magnification and Sparse Mamba for Micro-Expression Recognition
This work addresses the challenge of recognizing subtle, short-duration micro-expressions for emotion analysis. Its contribution is an incremental improvement over prior methods, with reported gains on standard benchmarks.
The paper tackles micro-expression recognition by proposing AMMSM, a multi-task learning framework that uses adaptive motion magnification and a sparse Mamba architecture, achieving state-of-the-art accuracy and robustness on two standard datasets.
Micro-expressions are typically regarded as unconscious manifestations of a person's genuine emotions. However, their short duration and subtle signals pose significant challenges for downstream recognition. To address this, we propose a multi-task learning framework named Adaptive Motion Magnification and Sparse Mamba (AMMSM). The framework captures micro-expressions more accurately through self-supervised subtle motion magnification, while its sparse spatial selection Mamba architecture combines sparse activation with the Visual Mamba model to represent key motion regions more effectively. Additionally, we employ evolutionary search to optimize the magnification factor and the sparsity ratios of spatial selection, followed by fine-tuning to further improve performance. Extensive experiments on two standard datasets demonstrate that the proposed AMMSM achieves state-of-the-art (SOTA) accuracy and robustness.
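The evolutionary search over the magnification factor and sparsity ratio can be illustrated with a minimal sketch. The abstract does not specify the search procedure, so the code below assumes a generic elitist scheme with Gaussian mutation. The names `evolutionary_search` and `toy_fitness`, the parameter bounds, and the synthetic fitness function are all hypothetical stand-ins. In practice, the fitness would be the validation accuracy of the recognition model trained or evaluated with the candidate hyperparameters.

```python
import random


def evolutionary_search(fitness, bounds, pop_size=20, generations=30,
                        elite_frac=0.25, sigma=0.1, seed=0):
    """Generic elitist evolutionary search (an assumed scheme, not the paper's).

    fitness: callable mapping a candidate tuple to a score (higher is better).
    bounds:  list of (low, high) pairs, one per hyperparameter.
    """
    rng = random.Random(seed)

    def sample():
        # Draw a random candidate uniformly inside the bounds.
        return tuple(rng.uniform(lo, hi) for lo, hi in bounds)

    def mutate(candidate):
        # Add Gaussian noise scaled to each range, then clip to the bounds.
        return tuple(
            max(lo, min(hi, x + rng.gauss(0.0, sigma * (hi - lo))))
            for x, (lo, hi) in zip(candidate, bounds)
        )

    population = [sample() for _ in range(pop_size)]
    n_elite = max(1, int(pop_size * elite_frac))

    for _ in range(generations):
        # Keep the best candidates unchanged and refill with mutated copies.
        population.sort(key=fitness, reverse=True)
        elites = population[:n_elite]
        children = [mutate(rng.choice(elites))
                    for _ in range(pop_size - n_elite)]
        population = elites + children

    return max(population, key=fitness)


def toy_fitness(candidate):
    # Synthetic stand-in for validation accuracy: peaks at an arbitrary
    # magnification factor of 4.0 and sparsity ratio of 0.5.
    magnification, sparsity = candidate
    return -((magnification - 4.0) ** 2) - 10.0 * (sparsity - 0.5) ** 2


# Hypothetical search ranges for (magnification factor, sparsity ratio).
best = evolutionary_search(toy_fitness, bounds=[(1.0, 10.0), (0.1, 0.9)])
print(best)
```

Because the elite candidates are carried over unchanged, the best score never decreases between generations. The fine-tuning stage mentioned in the abstract would then start from the hyperparameters returned by the search.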