SD CL AS, Jun 25, 2018

Single-channel Speech Dereverberation via Generative Adversarial Training

arXiv:1806.09325v1, 13 citations
Originality: Incremental advance
AI Analysis

This work addresses speech quality enhancement in reverberant environments, which is important for applications like hearing aids and voice assistants, but it appears incremental as it builds on existing neural network and GAN methods.

The paper tackles single-channel speech dereverberation with DeReGAT, a CBLDNN trained with generative adversarial training so that speech quality improves beyond what MSE minimization alone achieves. The authors report that it outperforms WPE and DNN-based systems and adapts to varying acoustic environments.

In this paper, we propose a single-channel speech dereverberation system (DeReGAT) based on a convolutional, bidirectional long short-term memory and deep feed-forward neural network (CBLDNN) with generative adversarial training (GAT). To obtain better speech quality than minimizing a mean square error (MSE) alone provides, GAT is employed to make the dereverberated speech indistinguishable from the clean samples. In addition, our system can handle a wide range of reverberation and adapts well to varying environments. The experimental results show that the proposed model outperforms weighted prediction error (WPE) and deep neural network-based systems. DeReGAT is also extended to an online speech dereverberation scenario, where it achieves performance comparable to the offline case.
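The idea of training beyond MSE can be illustrated with a minimal sketch of a combined objective. This is not the paper's formulation, which is not given here: the binary cross-entropy form of the adversarial term, the function names, and the `adv_weight` value are all assumptions chosen for illustration. The generator is penalized both for deviating from the clean features and for producing output the discriminator recognizes as dereverberated rather than clean.

```python
import math

def mse(pred, target):
    """Mean squared error between two equal-length feature vectors."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def bce(prob, label, eps=1e-7):
    """Binary cross-entropy for a single discriminator probability."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def generator_loss(enhanced, clean, d_prob_enhanced, adv_weight=0.01):
    """MSE plus a weighted adversarial term (assumed form, hypothetical weight)."""
    # The generator wants the discriminator to label its output as clean (1).
    return mse(enhanced, clean) + adv_weight * bce(d_prob_enhanced, 1.0)

def discriminator_loss(d_prob_clean, d_prob_enhanced):
    """The discriminator wants clean -> 1 and dereverberated -> 0."""
    return bce(d_prob_clean, 1.0) + bce(d_prob_enhanced, 0.0)
```

With identical MSE, the generator loss is lower when the discriminator assigns the dereverberated output a higher probability of being clean, which is the pressure that pushes the output toward indistinguishability from clean speech.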

