SD | LG | Sep 16, 2017

Nonnegative HMM for Babble Noise Derived from Speech HMM: Application to Speech Enhancement

arXiv:1709.05559v1 | 38 citations
Originality: Incremental advance
AI Analysis

This paper addresses the cocktail party problem in speech processing, offering an incremental improvement by explicitly modeling babble as a sum of speech waveforms.

The paper models multitalker babble noise to improve speech enhancement. It develops a gamma nonnegative HMM derived from a speech HMM and uses it in a noise reduction algorithm that significantly outperformed conventional methods in objective and subjective evaluations.

Deriving a good model for multitalker babble noise can facilitate different speech processing algorithms, e.g. noise reduction, to reduce the so-called cocktail party difficulty. In the available systems, the fact that the babble waveform is generated as a sum of N different speech waveforms is not exploited explicitly. In this paper, first we develop a gamma hidden Markov model for power spectra of the speech signal, and then formulate it as a sparse nonnegative matrix factorization (NMF). Second, the sparse NMF is extended by relaxing the sparsity constraint, and a novel model for babble noise (gamma nonnegative HMM) is proposed in which the babble basis matrix is the same as the speech basis matrix, and only the activation factors (weights) of the basis vectors are different for the two signals over time. Finally, a noise reduction algorithm is proposed using the derived speech and babble models. All of the stationary model parameters are estimated using the expectation-maximization (EM) algorithm, whereas the time-varying parameters, i.e. the gain parameters of speech and babble signals, are estimated using a recursive EM algorithm. The objective and subjective listening evaluations show that the proposed babble model and the final noise reduction algorithm significantly outperform the conventional methods.
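The shared-basis idea in the abstract can be illustrated with a small NumPy sketch. It is not the paper's algorithm: there is no EM training, the basis matrix `W` is random, HMM states are drawn independently per frame instead of following a transition matrix, and the gamma shape value is arbitrary. It also assumes that the power spectra of independent talkers add. All function and variable names are hypothetical. The sketch shows only that a speech HMM corresponds to one-hot (sparse) activations over `W`, while a sum of N talkers yields dense activations over the same `W`.

```python
import numpy as np

def speech_activations(states, num_states, gain=1.0):
    """One-hot activations (num_states x T): one HMM state is active per frame."""
    T = len(states)
    H = np.zeros((num_states, T))
    H[states, np.arange(T)] = gain
    return H

def babble_activations(talker_states, num_states):
    """Sum the one-hot activations of all talkers; the result is no longer sparse."""
    return sum(speech_activations(s, num_states) for s in talker_states)

def sample_gamma_spectra(W, H, shape, rng):
    """Draw gamma-distributed power spectra with mean W @ H (scale = mean / shape)."""
    mean = W @ H
    return rng.gamma(shape, mean / shape)

rng = np.random.default_rng(0)
F, K, T, N = 8, 5, 20, 6  # frequency bins, HMM states, frames, talkers (all assumed)

# Assumed basis: column k is the mean power spectrum of HMM state k.
W = rng.uniform(0.1, 1.0, size=(F, K))

# Simplification: states are sampled i.i.d. rather than from a Markov chain.
speech_states = rng.integers(0, K, size=T)
talker_states = [rng.integers(0, K, size=T) for _ in range(N)]

H_speech = speech_activations(speech_states, K)   # sparse: one nonzero per column
H_babble = babble_activations(talker_states, K)   # dense: columns sum to N

X_speech = sample_gamma_spectra(W, H_speech, shape=2.0, rng=rng)
X_babble = sample_gamma_spectra(W, H_babble, shape=2.0, rng=rng)
```

Both `X_speech` and `X_babble` are generated from the same `W`; only the activation matrices differ, which mirrors the relaxation of the sparsity constraint described above.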

