Speech Enhancement Based on Non-stationary Noise-driven Geometric Spectral Subtraction and Phase Spectrum Compensation
This work addresses speech enhancement for noisy environments like streets or crowds, offering an incremental improvement over existing spectral subtraction techniques.
The paper tackles speech enhancement in non-stationary noise by proposing a method that updates the noise estimate from the current frame and compensates the phase spectrum. It reports consistently better results than some recent methods in objective measures and listening tests for street and babble noise at various SNR levels.
In this paper, a speech enhancement method based on noise compensation performed on the short-time magnitude as well as phase spectra is presented. Unlike the conventional geometric approach (GA) to spectral subtraction (SS), the noise estimate to be subtracted from the noisy speech spectrum is determined by exploiting the low-frequency regions of the current frame of noisy speech rather than depending only on the initial silence frames. This gives the method the capability of tracking non-stationary noise, resulting in a non-stationary noise-driven geometric approach to spectral subtraction for speech enhancement. The noise-compensated magnitude spectrum from the GA step is then recombined with the unchanged phase of the noisy speech spectrum and used in phase compensation to obtain an enhanced complex spectrum, which is used to produce an enhanced speech frame. Extensive simulations using speech files from the NOIZEUS database show that the proposed method consistently outperforms some of the recent speech enhancement methods on noisy speech corrupted by street or babble noise at different SNR levels, in terms of objective measures, spectrogram analysis and formal subjective listening tests.
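The per-frame pipeline described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the low-frequency regions are exploited, so the sketch simply takes the mean magnitude of the lowest few non-DC bins as a flat noise floor. The gain uses the form commonly associated with the geometric approach, with a maximum-likelihood a priori SNR in place of any smoothing across frames. The phase step adds an antisymmetric, noise-scaled offset to the spectrum before taking its angle. All function names, `n_low_bins`, `xi_min`, the value of `lam`, and the clipping safeguards are hypothetical choices made for this example.

```python
import numpy as np

EPS = 1e-12  # numerical floor added for this sketch, not taken from the paper


def low_band_noise_magnitude(mag, n_low_bins=4):
    """Assumed noise estimate: mean magnitude of the lowest non-DC bins of the
    current frame, used as a flat noise floor for every bin."""
    return np.full_like(mag, mag[1:1 + n_low_bins].mean())


def geometric_gain(gamma, xi):
    """Geometric-approach gain from the a posteriori SNR (gamma) and the
    a priori SNR (xi). The clipping is a safeguard added for this sketch."""
    num = 1.0 - (gamma + 1.0 - xi) ** 2 / (4.0 * gamma)
    den = 1.0 - (gamma - 1.0 - xi) ** 2 / (4.0 * xi)
    gain = np.sqrt(np.maximum(num, 0.0) / np.maximum(den, EPS))
    return np.clip(gain, 0.0, 1.0)


def antisymmetric_weights(n):
    """+1 on positive-frequency bins, -1 on negative-frequency bins,
    0 at DC and Nyquist (n is assumed to be even)."""
    psi = np.zeros(n)
    psi[1:n // 2] = 1.0
    psi[n // 2 + 1:] = -1.0
    return psi


def enhance_frame(frame, n_low_bins=4, lam=3.74, xi_min=1e-2):
    """Enhance one windowed frame: GA magnitude compensation followed by
    phase spectrum compensation. Parameter values are illustrative only."""
    spec = np.fft.fft(frame)
    mag, phase = np.abs(spec), np.angle(spec)

    # Noise estimate taken from the current frame rather than initial silence.
    noise_mag = low_band_noise_magnitude(mag, n_low_bins)

    # SNR estimates (maximum-likelihood a priori SNR, floored at xi_min).
    gamma = np.maximum(mag ** 2 / (noise_mag ** 2 + EPS), EPS)
    xi = np.maximum(gamma - 1.0, xi_min)

    # GA step: compensated magnitude recombined with the unchanged noisy phase.
    ga_spec = geometric_gain(gamma, xi) * mag * np.exp(1j * phase)

    # Phase compensation: offset the spectrum antisymmetrically, keep its angle.
    comp_spec = ga_spec + lam * antisymmetric_weights(len(frame)) * noise_mag
    enhanced_spec = np.abs(ga_spec) * np.exp(1j * np.angle(comp_spec))

    # The compensated spectrum is no longer conjugate-symmetric, so the
    # imaginary part of the inverse transform is discarded.
    return np.real(np.fft.ifft(enhanced_spec))
```

A complete enhancer would apply `enhance_frame` to overlapping windowed frames and reconstruct the signal by overlap-add. The framing parameters are left out here because the abstract does not specify them.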