LG · SD · AS · ML · Dec 19, 2019

Practical applicability of deep neural networks for overlapping speaker separation

arXiv:1912.09261v1
Originality: Synthesis-oriented
AI Analysis

This paper addresses speaker separation in noisy, multilingual settings for audio processing applications. Its contribution is incremental, since it builds on existing deep learning methods.

The paper evaluates the practical applicability of deep clustering and deep attractor networks for overlapping speaker separation. It shows that both methods work across multiple languages with limited performance loss for untrained languages, and it proposes modifications to handle background noise.

This paper examines the applicability, in realistic scenarios, of two deep learning based solutions to the overlapping speaker separation problem: deep clustering and deep attractor networks. First, it presents experiments showing that these methods are applicable to a broad range of languages. Further experiments indicate limited performance loss for untrained languages when these share common features with the trained language(s). Second, it investigates how the methods deal with realistic background noise and proposes some modifications to better cope with these disturbances.
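For readers unfamiliar with deep clustering, the sketch below illustrates the affinity objective commonly associated with that method: each time-frequency bin gets an embedding vector, and training pushes bins dominated by the same speaker toward similar embeddings. This is a minimal sketch of the standard formulation, not code from the paper. The function name `deep_clustering_loss`, the array shapes, and the toy data are assumptions made for illustration, and the paper's noise-handling modifications are not represented.

```python
import numpy as np


def deep_clustering_loss(V, Y):
    """Affinity loss ||V V^T - Y Y^T||_F^2 (hypothetical helper, for illustration).

    V: (N, D) embeddings, one row per time-frequency bin (assumed unit-norm).
    Y: (N, C) one-hot labels marking the dominant speaker of each bin.

    The expanded form below avoids building the N x N affinity matrices:
    ||V^T V||_F^2 - 2 ||V^T Y||_F^2 + ||Y^T Y||_F^2.
    """
    vtv = V.T @ V  # (D, D)
    vty = V.T @ Y  # (D, C)
    yty = Y.T @ Y  # (C, C)
    return float(np.sum(vtv ** 2) - 2.0 * np.sum(vty ** 2) + np.sum(yty ** 2))


# Toy usage: 6 bins, 2 speakers, 3-dimensional embeddings (all values made up).
rng = np.random.default_rng(0)
labels = np.array([0, 0, 1, 1, 0, 1])
Y_toy = np.eye(2)[labels]                              # one-hot speaker assignment
V_toy = rng.normal(size=(6, 3))
V_toy /= np.linalg.norm(V_toy, axis=1, keepdims=True)  # unit-norm rows
loss_toy = deep_clustering_loss(V_toy, Y_toy)
```

At inference time, the embeddings are typically clustered (for example with k-means) and each cluster is turned into a mask for one speaker. Deep attractor networks instead form one attractor point per speaker in the embedding space and derive masks from the similarity of each bin to those attractors.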

