SD CL AS · Jun 26, 2023

The Singing Voice Conversion Challenge 2023

arXiv:2306.14422v2 · 75 citations · h-index: 55
Originality: Synthesis-oriented
AI Analysis

This work addresses benchmarking for singing voice conversion systems, providing incremental updates to the voice conversion challenge series.

The Singing Voice Conversion Challenge 2023 compared singing voice conversion systems on two tasks, in-domain and cross-domain. The top system reached human-level naturalness on both tasks, but no team matched the target speakers in similarity, and the cross-domain task proved harder, especially for similarity.

We present the latest iteration of the voice conversion challenge (VCC) series, a biennial scientific event aiming to compare and understand different voice conversion (VC) systems based on a common dataset. This year we shifted our focus to singing voice conversion (SVC), and thus named the challenge the Singing Voice Conversion Challenge (SVCC). A new database was constructed for two tasks, namely in-domain and cross-domain SVC. The challenge ran for two months, and in total we received 26 submissions, including 2 baselines. Through a large-scale crowd-sourced listening test, we observed that for both tasks, although human-level naturalness was achieved by the top system, no team was able to obtain a similarity score as high as that of the target speakers. Also, as expected, cross-domain SVC is harder than in-domain SVC, especially in the similarity aspect. We also investigated whether existing objective measurements were able to predict perceptual performance, and found that only a few of them could reach a significant correlation.

Code Implementations: 1 repo