SD | Mar 16

Cepstral Smoothing of Binary Masks for Convolutive Blind Separation of Speech Mixtures

arXiv:2603.14983 | 22.44 citations | h-index: 19
AI Analysis

This work addresses speech separation for audio processing applications, but it is incremental, building on existing blind source separation techniques.

The paper tackles musical noise in blind source separation of speech mixtures by applying cepstral smoothing to binary time-frequency masks. Experiments with both simulated and real recordings gave promising results.

In this paper, we propose a novel separation system for extracting two speech signals from two microphone recordings. Our system combines a blind source separation (BSS) technique with cepstral smoothing of binary time-frequency masks. The latter consists of two steps. First, the two binary masks are estimated from the separated output signals of the BSS algorithm. Second, cepstral smoothing is applied to these spectral masks in order to reduce the musical noise typically produced by time-frequency masking. Experiments were carried out with both artificially mixed speech signals, using a simulated room model, and two real recordings. The evaluation results are promising and show the effectiveness of our system.
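The two steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the BSS stage has already produced two magnitude spectrograms of shape (frames, bins), that a bin is assigned to whichever output is louder, and that smoothing is a first-order recursion over time in the cepstral domain. The function names and the parameters `beta_low`, `beta_high`, `n_low`, and `floor` are hypothetical choices made for this example. Any special handling of pitch-related quefrencies is omitted.

```python
import numpy as np

def binary_masks(mag1, mag2):
    """Assign each time-frequency bin to whichever BSS output is louder there."""
    mask1 = (mag1 > mag2).astype(float)
    return mask1, 1.0 - mask1

def cepstral_smooth(mask, beta_low=0.2, beta_high=0.8, n_low=8, floor=0.1):
    """Smooth a one-sided mask of shape (frames, bins) over time in the cepstral domain.

    All parameter values are illustrative assumptions, not values from the paper.
    """
    n_frames = mask.shape[0]
    # Floor the mask so the logarithm stays finite where the mask is zero.
    log_mask = np.log(np.maximum(mask, floor))
    # Real cepstrum of each frame: inverse FFT of the log mask.
    cep = np.fft.irfft(log_mask, axis=1)
    n_quef = cep.shape[1]
    # Symmetric quefrency index, so mirrored coefficients share one constant.
    idx = np.arange(n_quef)
    quef = np.minimum(idx, n_quef - idx)
    # Light smoothing for low quefrencies (coarse mask shape),
    # heavy smoothing for high quefrencies (fine structure, isolated peaks).
    beta = np.where(quef < n_low, beta_low, beta_high)
    smoothed = np.empty_like(cep)
    smoothed[0] = cep[0]
    for t in range(1, n_frames):
        smoothed[t] = beta * smoothed[t - 1] + (1.0 - beta) * cep[t]
    # Back to the spectral domain; the cepstrum is symmetric, so the FFT is real.
    out = np.exp(np.fft.rfft(smoothed, axis=1).real)
    return np.clip(out, 0.0, 1.0)
```

The smoothed mask would then be multiplied with the corresponding BSS output spectrogram before resynthesis. Smoothing high quefrencies more strongly than low ones is what suppresses isolated, randomly flickering mask peaks while leaving the broad mask envelope free to follow the speech.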

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
