Integrated Speech Enhancement Method Based on Weighted Prediction Error and DNN for Dereverberation and Denoising
This work addresses speech quality and intelligibility in audio processing, but the contribution is incremental: it builds on the existing WPE method by adding DNN components.
The paper tackles the problem of speech degradation from reverberation and additive noise by integrating a deep neural network (DNN) into the weighted prediction error (WPE) method for dereverberation and denoising, resulting in significant improvements in speech quality and faster runtime.
Both reverberation and additive noise degrade speech quality and intelligibility. The weighted prediction error (WPE) method performs well on dereverberation but has two limitations. First, WPE does not account for additive noise, which degrades its dereverberation performance. Second, it relies on a time-consuming iterative process, and there is neither a guarantee of convergence nor a widely accepted convergence criterion. In this paper, we integrate a deep neural network (DNN) into WPE for dereverberation and denoising. The DNN suppresses the background noise so that the noise-free assumption of WPE is met. The DNN is also used to directly predict the spectral variance of the target speech, so that WPE works without iteration. Experimental results show that the proposed method significantly improves speech quality and runs fast.
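To illustrate why a known spectral variance removes the need for iteration, the following is a minimal sketch of the standard single-channel WPE filter estimate, solved once per frequency bin as a weighted least-squares problem. It is not the authors' implementation. The function name `wpe_one_pass`, its parameters, and the default delay and tap count are assumptions chosen for illustration. The `variance` argument stands in for the DNN output, and the noise-suppression stage is omitted.

```python
import numpy as np


def wpe_one_pass(Y, variance, delay=3, taps=10, eps=1e-10):
    """Single-channel WPE with a given spectral variance (no iteration).

    Y        : complex STFT of the observed speech, shape (freq_bins, frames)
    variance : spectral variance of the target speech, same shape as Y
               (hypothetical stand-in for the DNN prediction)
    delay    : prediction delay in frames (assumed value)
    taps     : number of prediction filter taps (assumed value)
    """
    n_bins, n_frames = Y.shape
    D = np.empty_like(Y)
    for f in range(n_bins):
        y = Y[f]
        # Row t holds the delayed past frames y[t-delay], ..., y[t-delay-taps+1].
        Ybar = np.zeros((n_frames, taps), dtype=Y.dtype)
        for k in range(taps):
            shift = delay + k
            if shift < n_frames:
                Ybar[shift:, k] = y[:n_frames - shift]
        # Per-frame weights 1 / lambda(t), floored to avoid division by zero.
        w = 1.0 / np.maximum(variance[f], eps)
        # Weighted normal equations R g = r, with
        # R = sum_t ybar(t) ybar(t)^H / lambda(t), r = sum_t ybar(t) y(t)^* / lambda(t).
        R = (Ybar.T * w) @ Ybar.conj()
        r = (Ybar.T * w) @ y.conj()
        g = np.linalg.solve(R + eps * np.eye(taps), r)
        # Subtract the predicted late reverberation: d(t) = y(t) - g^H ybar(t).
        D[f] = y - Ybar @ g.conj()
    return D
```

In conventional WPE the variance is unknown, so it is re-estimated from the current output and the filter is solved again, repeatedly. With the variance supplied externally, a single solve per frequency bin suffices, which is the source of the speed-up described above.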