AS SD SP · Oct 25, 2019

Adaptive blind audio source extraction supervised by dominant speaker identification using x-vectors

Jakub Janský, Jiří Málek, Jaroslav Čmejla, Tomáš Kounovský, Zbyněk Koldovský, Jindřich Žďánský

arXiv:1910.11824v1 · 7.331 citations

Originality: Incremental advance

AI Analysis

This work addresses speaker separation in noisy environments. The advance is incremental: it builds on independent vector analysis and adds a novel supervision mechanism.

The paper addresses the extraction of a target speaker's voice from a noisy mixture that contains interfering speakers. It proposes an adaptive blind audio source extraction algorithm supervised by x-vector-based dominant speaker identification, and verifies it in a scenario with a moving target speaker, a static interfering speaker, and environmental noise.

We propose a novel algorithm for adaptive blind audio source extraction. The proposed method is based on independent vector analysis and utilizes the auxiliary function optimization to achieve high convergence speed. The algorithm is partially supervised by a pilot signal related to the source of interest (SOI), which ensures that the method correctly extracts the utterance of the desired speaker. The pilot is based on the identification of a dominant speaker in the mixture using x-vectors. The properties of the x-vectors computed in the presence of cross-talk are experimentally analyzed. The proposed approach is verified in a scenario with a moving SOI, static interfering speaker, and environmental noise.
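
The idea of a pilot derived from dominant speaker identification can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that one speaker embedding per signal block is already available (standing in for x-vectors), that an enrolled embedding of the desired speaker exists, and that dominance is decided by thresholding cosine similarity. The names `dominance_pilot`, `block_embeddings`, `target_embedding`, and `threshold` are hypothetical.

```python
import numpy as np

def cosine_similarity(rows, vec):
    """Cosine similarity between each row of `rows` and the vector `vec`."""
    rows = np.asarray(rows, dtype=float)
    vec = np.asarray(vec, dtype=float)
    num = rows @ vec
    den = np.linalg.norm(rows, axis=1) * np.linalg.norm(vec)
    return num / np.maximum(den, 1e-12)  # guard against zero-norm embeddings

def dominance_pilot(block_embeddings, target_embedding, threshold=0.5):
    """Return a binary pilot (1.0 where the target is judged dominant) and the scores.

    Assumption: cosine scoring with a fixed threshold is a stand-in for
    whatever speaker identification back end is actually used.
    """
    scores = cosine_similarity(block_embeddings, target_embedding)
    pilot = (scores >= threshold).astype(float)
    return pilot, scores

# Toy embeddings (hypothetical values, 3 dimensions for readability).
target = np.array([1.0, 0.0, 0.0])
blocks = np.array([
    [0.9, 0.1, 0.0],   # block dominated by the target
    [0.0, 1.0, 0.0],   # block dominated by another speaker
    [0.7, 0.7, 0.0],   # cross-talk: both speakers active
])
pilot, scores = dominance_pilot(blocks, target)
print(pilot, scores)
```

In this toy setting the pilot marks the blocks where the desired speaker is judged dominant. A partially supervised extraction algorithm could then use such a signal to steer convergence toward the SOI rather than toward an interferer.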
