AS AI CL Jun 16, 2023

CML-TTS: A Multilingual Dataset for Speech Synthesis in Low-Resource Languages

Frederico S. Oliveira, Edresson Casanova, Arnaldo Cândido Júnior, Anderson S. Soares, Arlindo R. Galvão Filho

arXiv:2306.10097v1 12.223 citations h-index: 8

Originality: Synthesis-oriented

AI Analysis

This work addresses the shortage of data for multilingual TTS research, particularly for low-resource languages, by releasing a publicly available dataset and model. The contribution is incremental, since it adapts existing resources.

The authors tackle the lack of multilingual datasets for text-to-speech (TTS) in low-resource languages by creating CML-TTS, a dataset based on Multilingual LibriSpeech that contains audiobooks in seven languages. They also provide a YourTTS model trained on 3,176.13 hours from CML-TTS and 245.07 hours from LibriTTS.

In this paper, we present CML-TTS, a recursive acronym for CML-Multi-Lingual-TTS, a new Text-to-Speech (TTS) dataset developed at the Center of Excellence in Artificial Intelligence (CEIA) of the Federal University of Goias (UFG). CML-TTS is based on Multilingual LibriSpeech (MLS) and adapted for training TTS models, consisting of audiobooks in seven languages: Dutch, French, German, Italian, Portuguese, Polish, and Spanish. Additionally, we provide the YourTTS model, a multi-lingual TTS model, trained on 3,176.13 hours from CML-TTS and 245.07 hours of English speech from LibriTTS. Our purpose in creating this dataset is to open up new research possibilities in the TTS area for multi-lingual models. The dataset is publicly available under the CC-BY 4.0 license.
