AS | LG | SD | Feb 5, 2024

An Attention Long Short-Term Memory based system for automatic classification of speech intelligibility

arXiv:2402.02850v1 | 48 citations | h-index: 19 | Eng Appl Artif Intell
Originality: Incremental advance
AI Analysis

This work addresses speech intelligibility assessment for dysarthric individuals, but it is incremental as it adapts existing attention mechanisms to a specific domain.

The paper tackled automatic classification of speech intelligibility in dysarthric speech by developing an attention-enhanced LSTM system, which outperformed SVM and LSTM baselines on the UA-Speech database.

Speech intelligibility can be degraded by multiple factors, such as noisy environments, technical difficulties, or biological conditions. This work focuses on the development of an automatic non-intrusive system for predicting the speech intelligibility level in the last of these cases. The main contribution of our research on this topic is the use of Long Short-Term Memory (LSTM) networks with log-mel spectrograms as input features for this purpose. In addition, this LSTM-based system is further enhanced by the incorporation of a simple attention mechanism that is able to determine the frames most relevant to this task. The proposed models are evaluated on the UA-Speech database, which contains dysarthric speech with different degrees of severity. Results show that the attention LSTM architecture outperforms both a reference Support Vector Machine (SVM)-based system with hand-crafted features and an LSTM-based system with Mean-Pooling.
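
The difference between Mean-Pooling and attention over frames can be illustrated with a small sketch. The abstract does not give the exact attention formulation, so the code below assumes one common simple form: each frame's weight is a softmax over the dot product between that frame's LSTM output and a vector `u`. The names `mean_pool`, `attention_pool`, `frames`, and `u` are hypothetical, and the toy numbers stand in for real LSTM outputs. It uses only the standard library to stay dependency-free.

```python
import math

def mean_pool(frames):
    """Average the per-frame vectors (the Mean-Pooling baseline)."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / n for d in range(dim)]

def attention_pool(frames, u):
    """Weighted average of frames; weights are softmax(dot(frame, u)).

    Assumption: this scoring rule is illustrative, not necessarily the
    one used in the paper.
    """
    scores = [sum(h * w for h, w in zip(f, u)) for f in frames]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frames[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
    return pooled, weights

# Toy stand-in for LSTM outputs: 3 frames, 2 dimensions each.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# With an all-zero attention vector every frame scores the same,
# so attention pooling reduces to mean pooling.
uniform_pooled, uniform_weights = attention_pool(frames, [0.0, 0.0])

# A non-zero vector (hypothetical; it would be learned in practice)
# shifts weight toward frames that align with it.
focused_pooled, focused_weights = attention_pool(frames, [10.0, 0.0])
```

In this sketch, Mean-Pooling is the special case where all frames receive equal weight. A learned `u` lets the model emphasise the frames it finds most informative before the utterance-level vector is passed to a classifier.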

