Target Speaker Extraction for Overlapped Multi-Talker Speaker Verification
This work addresses speaker verification failure in noisy, multi-speaker environments. The contribution is incremental, as it builds on existing target speaker extraction methods.
The paper tackles the degradation of speaker verification performance in overlapped multi-talker scenarios. It proposes a framework that first extracts the target speaker's speech with a target speaker extraction module and then passes the extracted speech to the verification system, achieving a 65.7% relative equal error rate (EER) reduction.
The performance of speaker verification degrades significantly when the test speech is corrupted by interfering speakers. Speaker diarization separates speakers well if they are not temporally overlapped. However, if multiple talkers speak at the same time, a technique is needed to separate the speech in the spectral domain. This paper proposes an overlapped multi-talker speaker verification framework based on target speaker extraction. Specifically, given the target speaker information, the target speaker's speech is first extracted from the overlapped multi-talker speech by a target speaker extraction module. The extracted speech is then passed to the speaker verification system. Experimental results show that the proposed approach significantly improves overlapped multi-talker speaker verification performance and achieves a 65.7% relative equal error rate (EER) reduction.
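
The two-stage idea (extract the target speaker first, then verify) can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and not the paper's method: the "speakers" are sums of sinusoids rather than speech, `embed` is a normalized magnitude spectrum rather than a learned speaker embedding, and `extract_target` is a simple binary spectral mask derived from the enrollment signal rather than a neural extraction module. The helper `relative_reduction` shows how a relative EER reduction is commonly computed; the numbers passed to it in any usage are hypothetical, not results from the paper.

```python
import numpy as np

SAMPLE_RATE = 8000  # Hz; toy value chosen so tone frequencies fall on exact FFT bins


def tone_mix(freqs, phase=0.0, n=SAMPLE_RATE):
    """Toy 'speaker': a sum of unit-amplitude sinusoids (not real speech)."""
    t = np.arange(n) / SAMPLE_RATE
    return sum(np.sin(2 * np.pi * f * t + phase) for f in freqs)


def embed(signal):
    """Toy speaker embedding (assumption): L2-normalized magnitude spectrum."""
    mag = np.abs(np.fft.rfft(signal))
    return mag / np.linalg.norm(mag)


def extract_target(mixture, enrollment, rel_threshold=0.1):
    """Toy target speaker extraction (assumption): keep only the frequency
    bins where the enrollment signal has significant energy."""
    enroll_mag = np.abs(np.fft.rfft(enrollment, n=len(mixture)))
    mask = enroll_mag > rel_threshold * enroll_mag.max()
    return np.fft.irfft(np.fft.rfft(mixture) * mask, n=len(mixture))


def verify_score(enrollment, test):
    """Cosine similarity between enrollment and test embeddings."""
    return float(np.dot(embed(enrollment), embed(test)))


def relative_reduction(baseline_eer, proposed_eer):
    """Relative reduction: (baseline - proposed) / baseline."""
    return (baseline_eer - proposed_eer) / baseline_eer


# Enrollment and test utterances of the same toy target speaker (different phase).
enrollment = tone_mix([200, 400], phase=0.3)
target = tone_mix([200, 400])
interferer = tone_mix([310, 620])
mixture = target + interferer  # fully overlapped two-talker test signal

# Stage 0 (baseline): verify directly on the overlapped mixture.
score_mixture = verify_score(enrollment, mixture)

# Stage 1 + 2: extract the target in the spectral domain, then verify.
extracted = extract_target(mixture, enrollment)
score_extracted = verify_score(enrollment, extracted)

print(f"score on mixture:   {score_mixture:.3f}")
print(f"score on extracted: {score_extracted:.3f}")
```

In this toy setup the verification score against the raw mixture is about 0.71, while the score after extraction is close to 1.0, which mirrors the motivation for placing extraction in front of the verification system. A real system would replace both `embed` and `extract_target` with trained models.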