SD, AS | Nov 8, 2020

Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation

arXiv:2011.03955v1 | 4 citations
Originality: Incremental advance
AI Analysis

This work addresses speech enhancement for applications like communication systems, but it is incremental as it modifies an existing vocoder method.

The paper tackled the problem of generating clean speech waveforms from noisy and reverberant acoustic features by proposing a denoising and dereverberation hierarchical neural vocoder (DNR-HiNet), which outperformed the original HiNet vocoder and other neural vocoders in experiments.

This paper presents a denoising and dereverberation hierarchical neural vocoder (DNR-HiNet) to convert noisy and reverberant acoustic features into a clean speech waveform. We implement it mainly by modifying the amplitude spectrum predictor (ASP) in the original HiNet vocoder. This modified denoising and dereverberation ASP (DNR-ASP) can predict clean log amplitude spectra (LAS) from input degraded acoustic features. To achieve this, the DNR-ASP first predicts the noisy and reverberant LAS, the noise LAS related to the noise information, and the room impulse response related to the reverberation information, and then performs initial denoising and dereverberation. The initially processed LAS are then enhanced by another neural network to produce the final clean LAS. To further improve the quality of the generated clean LAS, we also introduce a bandwidth extension model and a frequency resolution extension model in the DNR-ASP. The experimental results indicate that the DNR-HiNet vocoder was able to generate a denoised and dereverberated waveform given noisy and reverberant acoustic features and outperformed the original HiNet vocoder and a few other neural vocoders. We also applied the DNR-HiNet vocoder to speech enhancement tasks, and its performance was competitive with several advanced speech enhancement methods.
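The abstract does not say which operation the DNR-ASP uses for its initial denoising and dereverberation step. As a rough illustration only, the sketch below swaps in two classical signal-processing techniques: power spectral subtraction for the noise, and division by the magnitude response of the room impulse response for the reverberation. The function name `initial_denoise_dereverb`, its parameters, and the `floor` constant are hypothetical and do not come from the paper; the sketch also assumes natural-log amplitude spectra and an impulse response that is short relative to the FFT size.

```python
import numpy as np

def initial_denoise_dereverb(noisy_las, noise_las, rir, n_fft, floor=1e-3):
    """Illustrative stand-in, NOT the paper's method.

    noisy_las, noise_las: arrays of shape (frames, n_fft // 2 + 1)
        holding natural-log amplitude spectra (assumed convention).
    rir: 1-D room impulse response, assumed short relative to n_fft.
    floor: hypothetical constant that keeps the result strictly positive.
    """
    noisy_amp = np.exp(noisy_las)
    noise_amp = np.exp(noise_las)

    # Power spectral subtraction, floored relative to the noisy power
    # so the subtraction can never produce a non-positive value.
    clean_pow = np.maximum(noisy_amp ** 2 - noise_amp ** 2,
                           (floor * noisy_amp) ** 2)

    # Magnitude response of the RIR on the same frequency grid.
    rir_mag = np.abs(np.fft.rfft(rir, n=n_fft))

    # Crude inverse filtering: divide out the RIR magnitude per bin.
    derev_amp = np.sqrt(clean_pow) / np.maximum(rir_mag, floor)
    return np.log(derev_amp)
```

With no noise and a unit-impulse response, this function returns its input unchanged, which is a convenient sanity check. In the paper's pipeline, the output of the initial step is further refined by another neural network, which this sketch does not model.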

