CV | May 27, 2025

MoPFormer: Motion-Primitive Transformer for Wearable-Sensor Activity Recognition

arXiv:2505.20744v2 | 6 citations | h-index: 4
Originality: Incremental advance
AI Analysis

This paper addresses interpretability and generalization issues in wearable-sensor activity recognition; its improvements in method and performance are incremental.

The paper tackles limited interpretability and weak cross-dataset generalization in wearable-sensor human activity recognition. It proposes MoPFormer, a self-supervised framework that tokenizes sensor signals into motion primitives and processes them with a Transformer, and it reports state-of-the-art performance on six benchmarks along with improved interpretability.

Human Activity Recognition (HAR) with wearable sensors is challenged by limited interpretability, which significantly impacts cross-dataset generalization. To address this challenge, we propose Motion-Primitive Transformer (MoPFormer), a novel self-supervised framework that enhances interpretability by tokenizing inertial measurement unit signals into semantically meaningful motion primitives and leverages a Transformer architecture to learn rich temporal representations. MoPFormer comprises two stages. The first stage is to partition multi-channel sensor streams into short segments and quantize them into discrete "motion primitive" codewords, while the second stage enriches those tokenized sequences through a context-aware embedding module and then processes them with a Transformer encoder. The proposed MoPFormer can be pre-trained using a masked motion-modeling objective that reconstructs missing primitives, enabling it to develop robust representations across diverse sensor configurations. Experiments on six HAR benchmarks demonstrate that MoPFormer not only outperforms state-of-the-art methods but also successfully generalizes across multiple datasets. More importantly, the learned motion primitives significantly enhance both interpretability and cross-dataset performance by capturing fundamental movement patterns that remain consistent across similar activities, regardless of dataset origin.
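The first stage and the masking step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the fixed-length non-overlapping windows, the random codebook, the nearest-codeword (Euclidean) assignment, the 30% mask ratio, and the function names `segment`, `quantize`, and `mask_tokens` are all assumptions made for illustration. The context-aware embedding module and the Transformer encoder are omitted.

```python
import numpy as np

def segment(stream, seg_len):
    """Split a (T, C) sensor stream into non-overlapping (N, seg_len, C) segments."""
    n = stream.shape[0] // seg_len
    return stream[: n * seg_len].reshape(n, seg_len, stream.shape[1])

def quantize(segments, codebook):
    """Map each segment to the index of its nearest codeword (assumed Euclidean)."""
    flat = segments.reshape(segments.shape[0], -1)            # (N, seg_len * C)
    diffs = flat[:, None, :] - codebook[None, :, :]           # (N, K, seg_len * C)
    dists = (diffs ** 2).sum(axis=-1)                         # (N, K)
    return dists.argmin(axis=1)                               # (N,)

def mask_tokens(tokens, mask_ratio, mask_id, rng):
    """Replace a random subset of tokens with mask_id; return masked copy and mask."""
    n_mask = max(1, int(round(mask_ratio * tokens.size)))
    idx = rng.choice(tokens.size, size=n_mask, replace=False)
    mask = np.zeros(tokens.size, dtype=bool)
    mask[idx] = True
    masked = tokens.copy()
    masked[mask] = mask_id
    return masked, mask

rng = np.random.default_rng(0)
stream = rng.standard_normal((1000, 6))        # hypothetical 6-channel IMU stream
segments = segment(stream, seg_len=20)         # 50 segments of 20 samples each
codebook = rng.standard_normal((32, 20 * 6))   # hypothetical codebook of 32 primitives
tokens = quantize(segments, codebook)          # one primitive index per segment
masked, mask = mask_tokens(tokens, mask_ratio=0.3, mask_id=32, rng=rng)
```

In a pre-training setup along these lines, a model would receive `masked` and be trained to predict the original `tokens` at the positions where `mask` is true. A learned codebook would normally replace the random one used here.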
