SDCLLGASMay 2, 2018

Convolutional-Recurrent Neural Networks for Speech Enhancement

arXiv:1805.00579v1184 citations
Originality Incremental advance
AI Analysis

This work addresses speech enhancement for applications like communication systems, but it is incremental as it combines existing neural network architectures.

The paper tackled speech enhancement by proposing an end-to-end convolutional-recurrent neural network model that exploits local structures in frequency and temporal domains, resulting in improvements of up to 0.6 PESQ on seen noise and 0.64 on unseen noise.

We propose an end-to-end model based on convolutional and recurrent neural networks for speech enhancement. Our model is purely data-driven and does not make any assumptions about the type or the stationarity of the noise. In contrast to existing methods that use multilayer perceptrons (MLPs), we employ both convolutional and recurrent neural network architectures. Thus, our approach allows us to exploit local structures in both the frequency and temporal domains. By incorporating prior knowledge of speech signals into the design of model structures, we build a model that is more data-efficient and achieves better generalization on both seen and unseen noise. Based on experiments with synthetic data, we demonstrate that our model outperforms existing methods, improving PESQ by up to 0.6 on seen noise and 0.64 on unseen noise.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes