ASSD · Jun 1, 2018

DNN Based Speech Enhancement for Unseen Noises Using Monte Carlo Dropout

arXiv:1806.00516v1 · 2 citations
Originality: Incremental advance
AI Analysis

This paper addresses speech enhancement in noisy environments, but the contribution is incremental: it builds on existing dropout and ensemble methods.

The paper tackles speech enhancement for unseen noises by using Monte Carlo dropout as a Bayesian estimator to improve DNN generalizability. It reports better performance in unseen noise and SNR conditions, though no concrete numbers are provided.

In this work, we propose the use of dropout as a Bayesian estimator for increasing the generalizability of a deep neural network (DNN) for speech enhancement. By using Monte Carlo (MC) dropout, we show that the DNN performs better enhancement in unseen noise and SNR conditions. The DNN is trained on speech corrupted with Factory2, M109, Babble, Leopard and Volvo noises at SNRs of 0, 5 and 10 dB and tested on speech with White, Pink and Factory1 noises. Speech samples are obtained from the TIMIT database and noises from NOISEX-92. In another experiment, we train five DNN models separately, one on each of the Factory2, M109, Babble, Leopard and Volvo noises, at 0, 5 and 10 dB SNRs. The model precision (estimated using MC dropout) is used as a proxy for squared error to dynamically select the best of the DNN models for each frame of test data.
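The two ideas in the abstract, MC dropout inference and per-frame model selection, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the one-hidden-layer network, the dropout rate, the number of passes, and the function names `mc_dropout_predict` and `select_per_frame` are all assumptions made for illustration. The sketch also assumes that "highest model precision" can be approximated by "lowest variance across stochastic forward passes"; the paper's exact precision estimate may differ.

```python
import numpy as np


def mc_dropout_predict(x, weights, rng, n_passes=50, p_drop=0.2):
    """Monte Carlo dropout: average several forward passes with dropout ON.

    x       : (frames, features) noisy input features
    weights : (W1, b1, W2, b2) of a hypothetical one-hidden-layer network
    Returns the predictive mean (frames, out) and a per-frame variance (frames,).
    """
    W1, b1, W2, b2 = weights
    outputs = []
    for _ in range(n_passes):
        h = np.maximum(x @ W1 + b1, 0.0)          # ReLU hidden layer
        keep = rng.random(h.shape) >= p_drop      # sample a fresh dropout mask
        h = h * keep / (1.0 - p_drop)             # inverted-dropout scaling
        outputs.append(h @ W2 + b2)
    outputs = np.stack(outputs)                   # (n_passes, frames, out)
    mean = outputs.mean(axis=0)
    # Variance across passes, averaged over output dimensions, per frame.
    var = outputs.var(axis=0).mean(axis=1)
    return mean, var


def select_per_frame(x, models, rng, **kwargs):
    """For each frame, keep the output of the model with the lowest MC variance.

    Assumption: low predictive variance stands in for high model precision,
    which the abstract uses as a proxy for squared error.
    """
    results = [mc_dropout_predict(x, w, rng, **kwargs) for w in models]
    means = np.stack([m for m, _ in results])      # (models, frames, out)
    variances = np.stack([v for _, v in results])  # (models, frames)
    best = variances.argmin(axis=0)                # (frames,) chosen model index
    enhanced = means[best, np.arange(x.shape[0])]  # (frames, out)
    return enhanced, best
```

In use, `models` would hold one weight tuple per noise-specific DNN (five in the abstract's second experiment), and `best` would record which model was chosen for each test frame.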

