SD, CL, AS | Aug 16, 2023

ChinaTelecom System Description to VoxCeleb Speaker Recognition Challenge 2023

arXiv:2308.08181v1 | h-index: 2
Originality: Synthesis-oriented
AI Analysis

This work addresses speaker recognition, but it is incremental: it combines existing ResNet architectures with score fusion and calibration.

The paper tackled speaker recognition in the VoxCeleb2023 challenge by fusing multiple ResNet variants trained on VoxCeleb2 and applying score calibration, resulting in a minDCF of 0.1066 and EER of 1.980%.

This technical report describes the ChinaTelecom system for Track 1 (closed) of the VoxCeleb2023 Speaker Recognition Challenge (VoxSRC 2023). Our system consists of several ResNet variants trained only on VoxCeleb2, which were later fused for better performance. Score calibration was also applied to each variant and to the fused system. The final submission achieved a minDCF of 0.1066 and an EER of 1.980%.
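To make the two reported metrics and the fusion step concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the fusion weights or the calibration method, so `fuse_scores` is a hypothetical weighted average of per-system scores, and calibration is omitted. The minDCF operating point (`p_target=0.05`, `c_miss=c_fa=1`) is an assumption about the challenge settings and should be checked against the official evaluation plan.

```python
import numpy as np

def fuse_scores(scores, weights=None):
    """Weighted average of scores; `scores` has shape (n_systems, n_trials).
    Hypothetical fusion rule: the report's actual weights are not given."""
    scores = np.asarray(scores, dtype=float)
    if weights is None:
        weights = np.ones(scores.shape[0])
    weights = np.asarray(weights, dtype=float)
    return (weights / weights.sum()) @ scores

def error_rates(scores, labels):
    """Miss (FNR) and false-alarm (FPR) rates at every threshold.
    labels: 1 = target trial, 0 = non-target. Tied scores are not merged."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    labels = labels[np.argsort(scores, kind="mergesort")]
    n_tar = labels.sum()
    n_non = len(labels) - n_tar
    # Entry i corresponds to rejecting the i lowest-scoring trials.
    fnr = np.concatenate([[0.0], np.cumsum(labels) / n_tar])
    fpr = np.concatenate([[1.0], 1.0 - np.cumsum(1 - labels) / n_non])
    return fnr, fpr

def compute_eer(scores, labels):
    """Equal error rate: the point where FNR and FPR are closest."""
    fnr, fpr = error_rates(scores, labels)
    idx = np.argmin(np.abs(fnr - fpr))
    return float((fnr[idx] + fpr[idx]) / 2.0)

def compute_min_dcf(scores, labels, p_target=0.05, c_miss=1.0, c_fa=1.0):
    """Normalized minimum detection cost (operating point is an assumption)."""
    fnr, fpr = error_rates(scores, labels)
    cost = c_miss * p_target * fnr + c_fa * (1.0 - p_target) * fpr
    # Normalize by the cost of the best trivial accept-all / reject-all system.
    return float(cost.min() / min(c_miss * p_target, c_fa * (1.0 - p_target)))

if __name__ == "__main__":
    # Toy trial list: two hypothetical systems, four trials.
    system_scores = [[0.9, 0.3, 0.6, 0.1],
                     [0.8, 0.7, 0.2, 0.1]]
    labels = [1, 1, 0, 0]
    fused = fuse_scores(system_scores)
    print("EER:", compute_eer(fused, labels))
    print("minDCF:", compute_min_dcf(fused, labels))
```

In practice the fusion weights and a calibration mapping would be fitted on held-out trials; the equal-weight default here is only for illustration.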

