SD · CL · AS · Jan 17, 2023

The Newsbridge-Telecom SudParis VoxCeleb Speaker Recognition Challenge 2022 System Description

arXiv:2301.07491v1 · 1 citation · h-index: 19
Originality Synthesis-oriented
AI Analysis

This work addresses speaker diarization for audio processing applications, but it is incremental as it focuses on improving voice activity detection within an existing baseline framework.

The paper tackled speaker diarization in the VoxSRC 2022 challenge by developing a multi-stream voice activity detection method with a decision protocol based on classifier entropy, achieving close to state-of-the-art results.

We describe the system used by our team for the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC 2022) in the speaker diarization track. Our solution is designed around a new combination of voice activity detection algorithms that exploits the strengths of several systems. We introduce a novel multi-stream approach with a decision protocol based on classifier entropy. We call this method multi-stream voice activity detection and use it with standard baseline diarization embeddings, clustering, and resegmentation. With this work, we demonstrate that, using a strong baseline and working only on voice activity detection, one can achieve close to state-of-the-art results.
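
The abstract does not spell out the decision protocol, so the following is only a minimal sketch of one way an entropy-based choice between voice activity detection streams could work. It assumes each stream outputs a per-frame speech probability, and that for each frame the stream with the lowest binary entropy (the most confident one) makes the decision. The function names `binary_entropy` and `fuse_vad_streams`, the 0.5 threshold, and the example probabilities are all hypothetical and are not taken from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a two-class (speech / non-speech) posterior."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully certain posterior carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def fuse_vad_streams(streams, threshold=0.5):
    """Fuse several VAD streams frame by frame.

    Assumption (not from the paper): for each frame, the stream whose
    speech probability has the lowest entropy is trusted, and its
    probability is compared against `threshold`.
    """
    if not streams:
        raise ValueError("at least one stream is required")
    n_frames = len(streams[0])
    if any(len(s) != n_frames for s in streams):
        raise ValueError("all streams must cover the same number of frames")

    decisions = []
    for frame_probs in zip(*streams):
        # Pick the most confident classifier for this frame.
        most_confident = min(frame_probs, key=binary_entropy)
        decisions.append(most_confident >= threshold)
    return decisions


# Hypothetical per-frame speech probabilities from two VAD systems.
stream_a = [0.95, 0.55, 0.10]
stream_b = [0.60, 0.05, 0.45]
decisions = fuse_vad_streams([stream_a, stream_b])
print(decisions)
```

In this toy input, each frame is decided by whichever stream is further from 0.5, so a hesitant classifier never overrides a confident one. A real system would likely smooth the resulting decisions over time before passing segments to the diarization back end.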

