AS SDSep 7, 2020

Toward Speech Separation in The Pre-Cocktail Party Problem with TasTas

arXiv:2009.03692v42.31 citations Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses monaural speech separation of five-speaker mixtures; it is incremental, as it builds on existing methods such as DPRNN-TasNet.

The paper tackles monaural speech separation in the pre-cocktail party problem using TasTas, achieving a 10.41 dB SDR improvement on the WSJ0-5mix corpus, which rises to 11.14 dB when online data remixing augmentation is used in training.

In this note, we propose using TasTas \cite{shi2020speech} as an end-to-end approach to monaural speech separation in the pre-cocktail party problem. Our experiments on the public WSJ0-5mix corpus result in a 10.41 dB SDR improvement. If online voice data remixing augmentation \cite{zeghidour2020wavesplit} is adopted in training, an 11.14 dB SDR improvement can be achieved. We have open-sourced our re-implementation of DPRNN-TasNet at https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation, and our TasTas is built on this implementation, so we believe the results in this paper can be reproduced with ease.
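For readers unfamiliar with the metric, the following is a minimal sketch of how an SDR improvement figure can be computed for one speaker. It assumes a simplified SDR definition (reference energy over residual energy); the paper's own evaluation may use a different variant, such as BSS-eval SDR or a scale-invariant measure. The function names `sdr_db` and `sdr_improvement_db` and the synthetic white-noise "speakers" are illustrative assumptions, not taken from the paper or its repository.

```python
import numpy as np


def sdr_db(reference, estimate, eps=1e-8):
    """Simplified SDR in dB: energy of the reference over energy of the error.

    Assumption: this is the plain signal-to-residual ratio, not the
    BSS-eval decomposition; it is used here only for illustration.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    residual = reference - estimate
    return 10.0 * np.log10((np.sum(reference ** 2) + eps) / (np.sum(residual ** 2) + eps))


def sdr_improvement_db(reference, estimate, mixture):
    """SDR of the separated estimate minus SDR of the unprocessed mixture."""
    return sdr_db(reference, estimate) - sdr_db(reference, mixture)


# Synthetic stand-in for a five-speaker mixture (white noise, not real speech).
rng = np.random.default_rng(0)
speakers = rng.standard_normal((5, 8000))
mixture = speakers.sum(axis=0)

# Hypothetical separator output for speaker 0: the clean source plus a small error.
estimate = speakers[0] + 0.1 * rng.standard_normal(8000)

gain = sdr_improvement_db(speakers[0], estimate, mixture)
print(f"SDR improvement for speaker 0: {gain:.2f} dB")
```

In a five-speaker mixture the unprocessed input already has a negative SDR with respect to any single speaker, since the other four act as interference. Reporting the improvement over the mixture, rather than the absolute SDR, accounts for that starting point.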

