SD, AS · Mar 29, 2020

A Recursive Network with Dynamic Attention for Monaural Speech Enhancement

arXiv:2003.12973v3 · 35 citations
AI Analysis

This work addresses speech enhancement in noisy environments, which is important for applications like hearing aids and communication systems, but it appears incremental as it builds on existing attention and recursive methods.

The authors tackled monaural speech enhancement by proposing a framework that combines dynamic attention and recursive learning, achieving consistently better performance than recent state-of-the-art models on the TIMIT corpus in terms of PESQ and STOI scores.

A person tends to direct attention dynamically toward speech in complicated acoustic environments. Based on this phenomenon, we propose a framework that combines dynamic attention and recursive learning for monaural speech enhancement. In addition to a main noise reduction network, we design a separate sub-network that adaptively generates the attention distribution to control the information flow throughout the main network. To reduce the number of trainable parameters, we introduce recursive learning: the network is reused over multiple stages, and the intermediate outputs of the stages are linked through a memory mechanism. As a result, a more flexible and accurate estimate can be obtained. We conduct experiments on the TIMIT corpus. Experimental results show that the proposed architecture consistently outperforms recent state-of-the-art models in terms of both PESQ and STOI scores.
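The two ideas in the abstract, an attention sub-network that gates the noise reduction network and a single set of weights reused across stages with a memory linking them, can be illustrated with a toy NumPy sketch. This is not the authors' architecture: the class name, the dense layers, the tanh memory update, and the masking of a magnitude spectrogram are all assumptions made only to show the control flow.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RecursiveAttentionSketch:
    """Toy illustration only; layer types and sizes are assumptions."""

    def __init__(self, n_freq, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        # Memory update: mixes the current stage input with the previous memory.
        self.W_mem = rng.normal(0.0, scale, (2 * n_freq + n_hidden, n_hidden))
        # Attention sub-network: maps the memory to gates in (0, 1).
        self.W_att = rng.normal(0.0, scale, (n_hidden, n_hidden))
        # Noise reduction network: input projection and mask output.
        self.W_in = rng.normal(0.0, scale, (2 * n_freq, n_hidden))
        self.W_out = rng.normal(0.0, scale, (n_hidden, n_freq))

    def num_parameters(self):
        # One weight set is shared by every stage, so this does not grow
        # with the number of stages.
        weights = (self.W_mem, self.W_att, self.W_in, self.W_out)
        return sum(w.size for w in weights)

    def stage(self, noisy, estimate, memory):
        # Each stage sees the noisy input and the previous stage's estimate.
        x = np.concatenate([noisy, estimate], axis=-1)
        # The memory carries information from earlier stages forward.
        memory = np.tanh(np.concatenate([x, memory], axis=-1) @ self.W_mem)
        # The attention distribution gates the hidden features.
        attention = sigmoid(memory @ self.W_att)
        hidden = np.tanh(x @ self.W_in) * attention
        # A mask in (0, 1) is applied to the noisy magnitudes.
        mask = sigmoid(hidden @ self.W_out)
        return mask * noisy, memory, attention

    def enhance(self, noisy, n_stages=3):
        estimate = noisy.copy()
        memory = np.zeros((noisy.shape[0], self.W_att.shape[0]))
        attention = None
        for _ in range(n_stages):
            # The same weights are reused at every stage.
            estimate, memory, attention = self.stage(noisy, estimate, memory)
        return estimate, attention
```

Calling `enhance` on a non-negative array of shape `(frames, n_freq)` returns an estimate of the same shape. Increasing `n_stages` adds computation but no new parameters, which is the parameter-saving property the abstract attributes to recursive learning.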

Code Implementations: 1 repo