Modulation-Domain Kalman Filtering for Monaural Blind Speech Denoising and Dereverberation
This work addresses speech enhancement for applications such as hearing aids and communication systems by improving clarity in noisy, reverberant environments, but it is incremental because it builds on existing Kalman filtering methods.
The paper tackled blind joint denoising and dereverberation of monaural speech by proposing a modulation-domain Kalman filtering algorithm that estimates speech log-magnitude spectra while tracking reverberation parameters like T60 and DRR. Experimental results showed effectiveness in improving speech quality, intelligibility, and dereverberation performance across various noise types and SNRs.
We describe a monaural speech enhancement algorithm based on modulation-domain Kalman filtering that blindly tracks the time-frequency log-magnitude spectra of speech and reverberation. The proposed adaptive algorithm performs blind joint denoising and dereverberation, while accounting for the inter-frame speech dynamics, by estimating the posterior distribution of the speech log-magnitude spectrum given the log-magnitude spectrum of the noisy reverberant speech. The Kalman filter update step models the non-linear relations between the speech, noise and reverberation log-spectra. The signal model incorporates two reverberation parameters, the reverberation time, $T_{60}$, and the direct-to-reverberant energy ratio (DRR), and the algorithm estimates and tracks both in every frequency bin in order to improve the estimation of the speech log-magnitude spectrum. We test the Kalman filtering algorithm and examine graphs of the estimated reverberation features over time. The proposed algorithm is evaluated in terms of speech quality, speech intelligibility and dereverberation performance for a range of reverberation parameters and SNRs, in different noise types, and is compared to competing denoising and dereverberation techniques. Experimental results using noisy reverberant speech demonstrate the effectiveness of the enhancement algorithm.
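To make the roles of $T_{60}$, the DRR and the non-linear log-spectral relation more concrete, the following is a minimal sketch of a toy signal model for a single frequency bin. It is not the model used in the paper. It assumes that the reverberant power decays geometrically from frame to frame at the rate implied by $T_{60}$, that the DRR fixes the ratio of direct to total reverberant-tail energy, and that speech, reverberation and noise add in the power domain (phase terms are ignored). All function names, the frame hop `hop_s` and the recursion itself are hypothetical and chosen only for illustration.

```python
import math


def decay_per_frame(t60_s: float, hop_s: float) -> float:
    """Power decay factor over one frame hop.

    By definition, T60 is the time for the energy to drop by 60 dB,
    so the power is scaled by 10**(-6 * hop_s / t60_s) per hop.
    """
    return 10.0 ** (-6.0 * hop_s / t60_s)


def tail_gain(alpha: float, drr_db: float) -> float:
    """Gain feeding direct power into the reverberant tail (toy assumption).

    For a unit direct impulse the tail powers are beta, beta*alpha, ...
    and sum to beta / (1 - alpha). Setting direct / tail = DRR gives beta.
    """
    drr_lin = 10.0 ** (drr_db / 10.0)
    return (1.0 - alpha) / drr_lin


def next_reverb_power(prev_reverb: float, prev_speech: float,
                      alpha: float, beta: float) -> float:
    """One step of the assumed recursion r_t = alpha * r_{t-1} + beta * s_{t-1}."""
    return alpha * prev_reverb + beta * prev_speech


def log_sum_power(*log_powers: float) -> float:
    """Natural-log power of a sum of components given their log-powers.

    This is the non-linear relation log(exp(s) + exp(r) + exp(n)),
    computed in a numerically stable way.
    """
    m = max(log_powers)
    return m + math.log(sum(math.exp(lp - m) for lp in log_powers))


if __name__ == "__main__":
    alpha = decay_per_frame(t60_s=0.5, hop_s=0.01)  # hypothetical values
    beta = tail_gain(alpha, drr_db=3.0)
    speech_power, reverb_power, noise_power = 1.0, 0.0, 0.01
    reverb_power = next_reverb_power(reverb_power, speech_power, alpha, beta)
    observed = log_sum_power(math.log(speech_power),
                             math.log(reverb_power),
                             math.log(noise_power))
    print(f"alpha={alpha:.4f}, beta={beta:.4f}, observed log-power={observed:.4f}")
```

In this sketch, a longer $T_{60}$ pushes `alpha` towards 1 (slower decay), while a higher DRR lowers `beta` (less energy fed into the tail). Because the observation is a log of a sum of powers, a Kalman filter operating on log-spectra cannot use a purely linear update, which is why the update step has to model the non-linear relation explicitly.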