Speaker and Posture Classification using Instantaneous Intraspeech Breathing Features
This work addresses privacy issues in biometric identification and activity detection by using breathing sounds instead of speech, but the contribution is incremental: it applies existing methods to a new dataset.
The paper tackles speaker and posture classification using intraspeech breathing sounds instead of acoustic speech features to address privacy concerns, achieving 87% accuracy for speaker classification and 98% for posture classification.
Acoustic features extracted from speech are widely used in problems such as biometric speaker identification and first-person activity detection. However, the use of speech for such purposes raises privacy issues, as the content is accessible to the processing party. In this work, we propose a method for speaker and posture classification using intraspeech breathing sounds. Instantaneous magnitude features are extracted using the Hilbert-Huang transform (HHT) and fed into a CNN-GRU network for classification of recordings from the open intraspeech breathing sound dataset, BreathBase, that we collected for this study. Using intraspeech breathing sounds, we obtained 87% speaker classification accuracy and 98% posture classification accuracy.
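
To make "instantaneous magnitude" concrete, the following is a minimal sketch of the Hilbert step only. It is not the paper's pipeline: the HHT first splits a signal into intrinsic mode functions by empirical mode decomposition, which is omitted here, and a synthetic amplitude-modulated tone stands in for one such component. The function name `instantaneous_features`, the 16 kHz sampling rate, and the test signal are all assumptions made for illustration.

```python
import numpy as np
from scipy.signal import hilbert


def instantaneous_features(component, fs):
    """Instantaneous magnitude and frequency of one narrowband component.

    Hypothetical helper: in a full HHT this would be applied to each
    intrinsic mode function produced by empirical mode decomposition.
    """
    analytic = hilbert(component)                   # analytic signal x + j*H{x}
    magnitude = np.abs(analytic)                    # instantaneous magnitude (envelope)
    phase = np.unwrap(np.angle(analytic))           # continuous instantaneous phase
    frequency = np.diff(phase) * fs / (2 * np.pi)   # in Hz, one sample shorter
    return magnitude, frequency


# Synthetic stand-in for a single component (assumed values, not from the paper):
# a 440 Hz carrier whose amplitude varies slowly at 2 Hz.
fs = 16000
t = np.arange(fs) / fs
envelope = 1.0 + 0.5 * np.sin(2 * np.pi * 2 * t)
component = envelope * np.cos(2 * np.pi * 440 * t)

magnitude, frequency = instantaneous_features(component, fs)
```

On this test signal the recovered magnitude tracks the slow envelope and the recovered frequency stays at the carrier. A per-sample magnitude sequence of this kind is what a sequence model such as a CNN-GRU could take as input, although the exact feature layout used in the paper is not stated in the abstract.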