AS · AI · CL · Jul 3, 2022

DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech

arXiv:2207.01063v3 · 89 citations · h-index: 6
Originality: Synthesis-oriented
AI Analysis

The paper provides a dataset for researchers in conversational TTS, but the contribution is incremental: it builds on existing dialogue data and baseline methods.

The authors tackled the lack of conversational aspects in Text-to-Speech (TTS) datasets by introducing DailyTalk, a high-quality conversational speech dataset with 2,541 dialogues, and showed that it can be used for general TTS and that their baseline model can represent contextual information.

The majority of current Text-to-Speech (TTS) datasets, which are collections of individual utterances, contain few conversational aspects. In this paper, we introduce DailyTalk, a high-quality conversational speech dataset designed for conversational TTS. We sampled, modified, and recorded 2,541 dialogues from the open-domain dialogue dataset DailyDialog inheriting its annotated attributes. On top of our dataset, we extend prior work as our baseline, where a non-autoregressive TTS is conditioned on historical information in a dialogue. From the baseline experiment with both general and our novel metrics, we show that DailyTalk can be used as a general TTS dataset, and more than that, our baseline can represent contextual information from DailyTalk. The DailyTalk dataset and baseline code are freely available for academic use with CC-BY-SA 4.0 license.

Code Implementations: 1 repo
