SD CL AS, Oct 23, 2020

EML System Description for VoxCeleb Speaker Diarization Challenge 2020

arXiv:2010.12497v1


AI Analysis

This is an incremental improvement for speaker diarization systems, addressing efficiency and accuracy in audio processing tasks.

The paper tackles speaker diarization by applying an online algorithm to the offline task of the VoxCeleb challenge. On the VoxConverse dev set it reports much better DER and JER than the challenge's offline baseline, with a real-time factor of about 0.01 on a single CPU machine.

This technical report describes the EML submission to the first VoxCeleb speaker diarization challenge. Although the challenge targets offline processing of the signals, the submitted system is essentially the EML online algorithm, which decides on the speaker labels at runtime approximately every 1.2 sec. For the first phase of the challenge, only the VoxCeleb2 dev dataset was used for training. The results on the provided VoxConverse dev set show much better accuracy in terms of both DER and JER than the offline baseline provided in the challenge. The real-time factor of the whole diarization process is about 0.01 on a single-CPU machine.
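To make the reported real-time factor concrete, here is a minimal sketch of how such a figure is conventionally computed: processing time divided by audio duration. The function name and the example timings are illustrative assumptions, not values or code from the report.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration.

    Values below 1.0 mean the system runs faster than real time.
    This helper is hypothetical and only illustrates the usual definition.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Illustrative numbers only: one hour of audio processed in 36 seconds
# corresponds to a real-time factor of about 0.01.
rtf = real_time_factor(processing_seconds=36.0, audio_seconds=3600.0)
print(f"RTF = {rtf:.2f}")
```

Under this definition, a real-time factor of about 0.01 means the diarization takes roughly one hundredth of the audio's duration.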
