SD CL AS | Jan 17, 2023

The Newsbridge-Telecom SudParis VoxCeleb Speaker Recognition Challenge 2022 System Description

Yannis Tevissen, Jérôme Boudy, Frédéric Petitpont

arXiv:2301.07491v1 | 2.31 citations | h-index: 19

Originality: Synthesis-oriented

AI Analysis

This work addresses speaker diarization for audio processing applications, but it is incremental as it focuses on improving voice activity detection within an existing baseline framework.

The paper tackled speaker diarization in the VoxSRC 2022 challenge by developing a multi-stream voice activity detection method with a decision protocol based on classifier entropy, achieving close to state-of-the-art results.

We describe the system used by our team for the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC 2022) in the speaker diarization track. Our solution was designed around a new combination of voice activity detection algorithms that uses the strengths of several systems. We introduce a novel multi-stream approach with a decision protocol based on classifier entropy. We call this method multi-stream voice activity detection and use it with standard baseline diarization embeddings, clustering and resegmentation. With this work, we successfully demonstrate that, using a strong baseline and working only on voice activity detection, one can achieve close to state-of-the-art results.
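
The abstract does not spell out the entropy-based decision protocol, so the following is only a minimal sketch of one plausible reading: for each frame, every voice activity detector outputs a speech probability, and the fused decision follows the detector whose output has the lowest binary entropy (that is, the most confident one). The function names, the 0.5 threshold, and the sample probabilities are assumptions made for illustration and are not taken from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) speech/non-speech posterior."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully certain classifier has zero entropy
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def fuse_vad_streams(streams, threshold=0.5):
    """Per frame, follow the stream whose posterior has the lowest entropy.

    streams: list of equal-length lists of speech probabilities, one per VAD.
    Returns a list of booleans (True = speech).
    Hypothetical rule for illustration; not the authors' exact protocol.
    """
    if not streams:
        raise ValueError("need at least one stream")
    n_frames = len(streams[0])
    if any(len(s) != n_frames for s in streams):
        raise ValueError("all streams must have the same number of frames")

    decisions = []
    for frame_probs in zip(*streams):
        # Lowest entropy = most confident classifier for this frame.
        most_confident = min(frame_probs, key=binary_entropy)
        decisions.append(most_confident >= threshold)
    return decisions


# Made-up posteriors from two hypothetical VAD systems over four frames.
vad_a = [0.9, 0.6, 0.1, 0.45]
vad_b = [0.7, 0.05, 0.4, 0.98]
print(fuse_vad_streams([vad_a, vad_b]))  # [True, False, False, True]
```

In the second and fourth frames the two detectors disagree, and the sketch sides with the one that is further from 0.5. A real system would also need to align the frame rates of the different detectors before fusing them, which is omitted here.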
