AS AI SD
May 20, 2025

Articulatory Feature Prediction from Surface EMG during Speech Production

Jihwan Lee, Kevin Huang, Kleanthis Avramidis, Simon Pistrosch, Monica Gonzalez-Machorro, Yoonjeong Lee, Björn Schuller, Louis Goldstein, Shrikanth Narayanan

arXiv:2505.13814v24.35 citations
h-index: 7
Has Code
INTERSPEECH

Originality: Highly original

AI Analysis

This work provides a novel approach to EMG-based speech synthesis, potentially benefiting speech-impaired individuals and human-computer interaction applications.

The researchers tackled the problem of predicting articulatory features from surface EMG signals during speech production, achieving a high prediction correlation of approximately 0.9 for most features and demonstrating that these features can be decoded into intelligible speech waveforms.

We present a model for predicting articulatory features from surface electromyography (EMG) signals during speech production. The proposed model integrates convolutional layers and a Transformer block, followed by separate predictors for articulatory features. Our approach achieves a high prediction correlation of approximately 0.9 for most articulatory features. Furthermore, we demonstrate that these predicted articulatory features can be decoded into intelligible speech waveforms. To our knowledge, this is the first method to decode speech waveforms from surface EMG via articulatory features, offering a novel approach to EMG-based speech synthesis. Additionally, we analyze the relationship between EMG electrode placement and articulatory feature predictability, providing knowledge-driven insights for optimizing EMG electrode configurations. The source code and decoded speech samples are publicly available.
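The "prediction correlation of approximately 0.9" above is reported per articulatory feature. As a minimal sketch of how such a score could be computed, the snippet below assumes the metric is the Pearson correlation between predicted and reference trajectories, which the abstract does not state explicitly. The function names `pearson` and `per_feature_correlation` and the toy data are hypothetical and are not taken from the authors' code.

```python
import math


def pearson(pred, target):
    """Pearson correlation between two equal-length sequences of floats."""
    n = len(pred)
    if n != len(target) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 frames")
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined for a constant trajectory")
    return cov / math.sqrt(var_p * var_t)


def per_feature_correlation(pred_frames, target_frames):
    """Return one correlation per feature.

    Both arguments are lists of frames; each frame is a list with one value
    per articulatory feature (assumed layout, for illustration only).
    """
    # Transpose frames-by-features into one trajectory per feature.
    pred_tracks = list(zip(*pred_frames))
    target_tracks = list(zip(*target_frames))
    return [pearson(p, t) for p, t in zip(pred_tracks, target_tracks)]


# Toy data, made up for illustration: 4 frames, 2 hypothetical features.
target = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
pred = [[0.1, 0.9], [1.1, 0.2], [2.1, 0.8], [3.1, 0.1]]
scores = per_feature_correlation(pred, target)
```

Scoring each feature separately, rather than pooling all features into one number, is what makes a statement such as "approximately 0.9 for most features" possible, since it shows which individual features are predicted less well.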
