SD AS May 31, 2021

Noise Classification Aided Attention-Based Neural Network for Monaural Speech Enhancement

arXiv:2105.14719v1
Originality: Incremental advance
AI Analysis

This work addresses speech enhancement for audio processing applications, but it is incremental as it builds on a previous method by adding noise classification and attention mechanisms.

The paper tackles monaural speech enhancement by proposing a noise classification aided attention-based neural network. It achieves better speech quality (PESQ) than OM-LSA and a previous work, and it generalizes better to unseen noise conditions.

This paper proposes a noise type classification aided attention-based neural network approach for monaural speech enhancement. The network builds on a previous work by introducing a noise classification subnetwork into the structure and feeding the classification embedding into the attention mechanism to guide the network toward better feature extraction. Specifically, to make the network end-to-end, an audio encoder and decoder constructed from temporal convolutions are used to transform between waveform and spectrogram. Additionally, our model is composed of two long short-term memory (LSTM) based encoders, two attention mechanisms, a noise classifier and a speech mask generator. Experiments show that, compared with OM-LSA and the previous work, the proposed noise classification aided attention-based approach achieves better performance in terms of speech quality (PESQ). More promisingly, our approach generalizes better to unseen noise conditions.
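To illustrate how a classification embedding could guide attention, here is a minimal toy sketch in plain Python. It is not the paper's architecture: the abstract does not give the attention formula, so the scaled dot-product scoring, the function names (`noise_guided_attention`, `speech_mask`), and the sigmoid mask rule are all assumptions for illustration, and the learned components (LSTM encoders, convolutional audio encoder and decoder, classifier) are replaced by fixed vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def noise_guided_attention(frames, noise_embedding):
    """Attend over frame features using the noise embedding as the query.

    ASSUMPTION: scaled dot-product scoring; the paper may use another form.
    frames: list of T feature vectors, each of dimension D.
    noise_embedding: vector of dimension D (stand-in for the classifier output).
    Returns (weights, context): T attention weights and a D-dim context vector.
    """
    scale = math.sqrt(len(noise_embedding))
    weights = softmax([dot(f, noise_embedding) / scale for f in frames])
    dim = len(frames[0])
    context = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
    return weights, context


def speech_mask(frames, context):
    """Hypothetical mask generator: sigmoid(frame + context), values in (0, 1).

    ASSUMPTION: a placeholder for the learned mask layers in a real model.
    """
    return [[1.0 / (1.0 + math.exp(-(x + c))) for x, c in zip(f, context)]
            for f in frames]


# Toy input: three frames of two features, plus a fixed noise embedding.
frames = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
noise_embedding = [2.0, 0.0]

weights, context = noise_guided_attention(frames, noise_embedding)
mask = speech_mask(frames, context)
# Apply the mask element-wise, as a mask-based enhancer would on a spectrogram.
enhanced = [[m * x for m, x in zip(mrow, frow)] for mrow, frow in zip(mask, frames)]
```

In this toy setup the frame most aligned with the noise embedding receives the largest attention weight, which is the sense in which the embedding "guides" feature extraction. A real model would learn the projections and the mask layers jointly with the noise classifier.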

