CL, SD | Apr 20, 2025

DialogueAgents: A Hybrid Agent-Based Speech Synthesis Framework for Multi-Party Dialogue

arXiv:2504.14482v1 | 5 citations | h-index: 9 | Has Code | ICME
Originality: Incremental advance
AI Analysis

This work addresses the high cost and low diversity of speech synthesis datasets, a problem for researchers in human-computer interaction. It is an incremental advance that builds on existing agent-based and synthesis methods.

The authors tackle the high cost and limited diversity of speech synthesis datasets by proposing DialogueAgents, a hybrid agent-based framework in which agents collaboratively generate dialogues. Using it, they build the MultiTalk dataset, which has improved emotional expressiveness.

Speech synthesis is crucial for human-computer interaction, enabling natural and intuitive communication. However, existing datasets involve high construction costs due to manual annotation and suffer from limited character diversity, contextual scenarios, and emotional expressiveness. To address these issues, we propose DialogueAgents, a novel hybrid agent-based speech synthesis framework, which integrates three specialized agents -- a script writer, a speech synthesizer, and a dialogue critic -- to collaboratively generate dialogues. Grounded in a diverse character pool, the framework iteratively refines dialogue scripts and synthesizes speech based on speech review, boosting emotional expressiveness and paralinguistic features of the synthesized dialogues. Using DialogueAgents, we contribute MultiTalk, a bilingual, multi-party, multi-turn speech dialogue dataset covering diverse topics. Extensive experiments demonstrate the effectiveness of our framework and the high quality of the MultiTalk dataset. We release the dataset and code at https://github.com/uirlx/DialogueAgents to facilitate future research on advanced speech synthesis models and customized data generation.
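
The write, synthesize, review loop described in the abstract can be pictured with a minimal Python sketch. This is not the authors' implementation. The function names, the `Review` fields, the score threshold, and the stub agents are all hypothetical stand-ins chosen for illustration. A real system would presumably call a language model for the writer and critic and a TTS model for the synthesizer.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Review:
    score: float   # hypothetical quality score in [0, 1]
    feedback: str  # hypothetical critique handed back to the script writer


def refine_dialogue(
    topic: str,
    characters: List[str],
    write_script: Callable[[str, List[str], Optional[str]], List[str]],
    synthesize: Callable[[List[str]], bytes],
    critique: Callable[[List[str], bytes], Review],
    max_rounds: int = 3,         # assumed iteration cap
    target_score: float = 0.8,   # assumed acceptance threshold
) -> Tuple[List[str], bytes, Review]:
    """Iterate writer -> synthesizer -> critic, keeping the best-scored round."""
    feedback: Optional[str] = None
    best: Optional[Tuple[List[str], bytes, Review]] = None
    for _ in range(max_rounds):
        script = write_script(topic, characters, feedback)
        audio = synthesize(script)
        review = critique(script, audio)
        if best is None or review.score > best[2].score:
            best = (script, audio, review)
        if review.score >= target_score:
            break
        feedback = review.feedback  # the critic's review drives the next rewrite
    assert best is not None  # holds whenever max_rounds >= 1
    return best


# Stub agents (assumptions, for demonstration only).
def stub_writer(topic: str, characters: List[str], feedback: Optional[str]) -> List[str]:
    tag = "[excited] " if feedback else ""
    return [f"{name}: {tag}Let's talk about {topic}." for name in characters]


def stub_synthesizer(script: List[str]) -> bytes:
    return "\n".join(script).encode("utf-8")  # placeholder standing in for audio


def stub_critic(script: List[str], audio: bytes) -> Review:
    expressive = all("[excited]" in line for line in script)
    return Review(1.0 if expressive else 0.4, "" if expressive else "Add emotion cues.")


if __name__ == "__main__":
    script, audio, review = refine_dialogue(
        "travel", ["Ava", "Ben"], stub_writer, stub_synthesizer, stub_critic
    )
    print(review.score, script)
```

With these stubs, the first round is rejected for lacking emotion cues and the second round is accepted, which mirrors the idea of refining scripts based on a review of the synthesized speech.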
