SD LG, Apr 29, 2016

Joint Sound Source Separation and Speaker Recognition

arXiv:1604.08852v1, 3 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of speaker recognition in noisy, multi-speaker environments. The advance is incremental: it extends existing NMF techniques rather than introducing a new framework.

The paper performs speaker recognition on simultaneous speech by solving source separation and speaker recognition jointly with Non-negative Matrix Factorization (NMF). On the CHiME corpus, the joint approach outperforms the sequential pipeline of source separation followed by i-vector-based speaker recognition.

Non-negative Matrix Factorization (NMF) has already been applied to learn speaker characterizations from single or non-simultaneous speech for speaker recognition applications. It is also known for its good performance in (blind) source separation for simultaneous speech. This paper explains how NMF can be used to jointly solve the two problems in a multichannel speaker recognizer for simultaneous speech. It is shown how state-of-the-art multichannel NMF for blind source separation can be easily extended to incorporate speaker recognition. Experiments on the CHiME corpus show that this method outperforms the sequential approach of first applying source separation, followed by speaker recognition that uses state-of-the-art i-vector techniques.
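To make the underlying idea concrete, here is a minimal single-channel sketch of NMF with Kullback-Leibler multiplicative updates, followed by a toy speaker-scoring step. It is not the paper's multichannel method. The function names (`nmf_kl`, `speaker_scores`), the synthetic dictionaries with disjoint frequency support, and the use of reconstructed energy as a score are all assumptions made for illustration.

```python
import numpy as np


def nmf_kl(V, rank, n_iter=200, W=None, seed=0, eps=1e-12):
    """Approximate non-negative V (freq x frames) as W @ H.

    Uses the standard multiplicative updates for the KL divergence.
    If W is supplied it is held fixed (rank is then ignored) and only
    the activations H are estimated.
    """
    rng = np.random.default_rng(seed)
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        if update_W:
            WH = W @ H + eps
            W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


def speaker_scores(V, dictionaries):
    """Score each speaker by the energy its dictionary explains in V.

    `dictionaries` is a list of per-speaker basis matrices (hypothetical,
    assumed to be learned beforehand from clean speech).
    """
    W = np.hstack(dictionaries)
    _, H = nmf_kl(V, rank=W.shape[1], W=W)
    scores, start = [], 0
    for D in dictionaries:
        stop = start + D.shape[1]
        scores.append(float((D @ H[start:stop]).sum()))
        start = stop
    return scores


# Toy data: two synthetic "speakers" occupying disjoint frequency bands.
rng = np.random.default_rng(1)
D_a = np.zeros((8, 2))
D_a[:4] = rng.random((4, 2)) + 0.1
D_b = np.zeros((8, 2))
D_b[4:] = rng.random((4, 2)) + 0.1

# A spectrogram produced by speaker A only.
V_a = D_a @ (rng.random((2, 20)) + 0.1)
scores = speaker_scores(V_a, [D_a, D_b])
print("speaker scores:", scores)
```

In this toy setting the recognized speaker is simply the index of the largest score. The joint approach described in the abstract differs in that separation and recognition share one multichannel factorization instead of running as separate stages.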

