WaveFormer: A Lightweight Transformer Model for sEMG-based Gesture Recognition
This work addresses the problem of deploying accurate gesture recognition on resource-constrained embedded systems for prosthetic and robotic control, though the contribution is incremental in that it builds on existing transformer architectures.
The paper tackles the challenge of classifying similar gestures from sEMG signals with high accuracy and low computational cost, achieving 95% accuracy on the EPN612 dataset and 6.75 ms inference latency with INT8 quantization.
Human-machine interaction, particularly in prosthetic and robotic control, has seen progress with gesture recognition via surface electromyographic (sEMG) signals. However, classifying similar gestures that produce nearly identical muscle signals remains a challenge, often reducing classification accuracy. Traditional deep learning models for sEMG gesture recognition are large and computationally expensive, limiting their deployment on resource-constrained embedded systems. In this work, we propose WaveFormer, a lightweight transformer-based architecture tailored for sEMG gesture recognition. Our model integrates time-domain and frequency-domain features through a novel learnable wavelet transform, enhancing feature extraction. In particular, the WaveletConv module, a multi-level wavelet decomposition layer with depthwise separable convolution, ensures both efficiency and compactness. With just 3.1 million parameters, WaveFormer achieves 95% classification accuracy on the EPN612 dataset, outperforming larger models. Furthermore, when profiled on a laptop equipped with an Intel CPU, INT8 quantization achieves real-time deployment with a 6.75 ms inference latency.
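To make the two ingredients of the WaveletConv description concrete, the following is a minimal sketch and not the authors' implementation. It assumes fixed Haar filters in place of the learnable wavelet filters the abstract describes, and it treats convolutions as one-dimensional with no bias terms. The function names (`haar_step`, `multilevel_haar`, `conv_params`, `depthwise_separable_params`) and the channel counts in the usage note are hypothetical, chosen only for illustration.

```python
import math


def haar_step(signal):
    """One level of a Haar decomposition: returns (approximation, detail).

    Assumption: fixed Haar filters stand in for learnable wavelet filters.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    # Low-pass branch: scaled pairwise sums (coarse trend of the signal).
    approx = [(signal[i] + signal[i + 1]) * scale
              for i in range(0, len(signal), 2)]
    # High-pass branch: scaled pairwise differences (local fluctuations).
    detail = [(signal[i] - signal[i + 1]) * scale
              for i in range(0, len(signal), 2)]
    return approx, detail


def multilevel_haar(signal, levels):
    """Multi-level decomposition.

    Returns [detail_1, ..., detail_levels, approx_levels], where each level
    re-decomposes the approximation produced by the previous one.
    """
    bands = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        bands.append(detail)
    bands.append(current)
    return bands


def conv_params(c_in, c_out, kernel):
    """Weight count of a standard 1D convolution (bias ignored)."""
    return c_in * c_out * kernel


def depthwise_separable_params(c_in, c_out, kernel):
    """Weight count of a depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = c_in * kernel   # one kernel per input channel
    pointwise = c_in * c_out    # 1x1 convolution that mixes channels
    return depthwise + pointwise
```

As a usage example, `multilevel_haar(window, 2)` on an 8-sample window yields detail bands of length 4 and 2 plus a final approximation of length 2, so each level halves the temporal resolution while separating finer from coarser frequency content. For a hypothetical layer with 8 input channels, 64 output channels, and a kernel of width 3, `conv_params(8, 64, 3)` gives 1536 weights while `depthwise_separable_params(8, 64, 3)` gives 536, which illustrates why depthwise separable convolution helps keep a model compact.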