SD | AI | Oct 31, 2024

The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings

arXiv:2411.00064v1 | 4 citations | h-index: 14 | Has Code | ISCSLP
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for standardized evaluation in voice cloning for conversational AI applications, but it is incremental as it builds on existing challenges and datasets.

The paper benchmarks zero-shot spontaneous-style voice cloning in conversational speech by organizing the ISCSLP 2024 CoVoC Challenge, which comprises two tracks and a 100-hour dataset, and reports evaluation results and findings from the submitted systems.

The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge aims to benchmark and advance zero-shot spontaneous style voice cloning, particularly focusing on generating spontaneous behaviors in conversational speech. The challenge comprises two tracks: an unconstrained track without limitation on data and model usage, and a constrained track only allowing the use of constrained open-source datasets. A 100-hour high-quality conversational speech dataset is also made available with the challenge. This paper details the data, tracks, submitted systems, evaluation results, and findings.

