CL · AI · AS · Jun 25, 2018

The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems

arXiv:1806.09514v1 · 105 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

This work provides a resource for researchers and developers to control emotion in voice generation systems, though it is incremental as it builds on existing data collection and synthesis methods.

The authors introduced an open-source emotional speech database with male and female actors in English and a male actor in French, covering 5 emotion classes, and demonstrated its utility by building a simple MLP system that converts neutral to angry speech, achieving positive results in a CMOS perception test.

In this paper, we present a database of emotional speech intended to be open-sourced and used for synthesis and generation purposes. It contains data for male and female actors in English and a male actor in French. The database covers 5 emotion classes, so it could be suitable for building synthesis and voice transformation systems with the potential to control the emotional dimension in a continuous way. We show the data's efficiency by building a simple MLP system that converts neutral speech to an angry speech style, and we evaluate it via a CMOS perception test. Even though the system is a very simple one, the test shows the efficiency of the data, which is promising for future work.
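As a rough illustration of how a CMOS (comparative mean opinion score) test like the one mentioned above is typically summarized, the sketch below averages per-listener comparison ratings and reports a normal-approximation 95% interval. The rating scale (-3 to +3), the sign convention (positive means the converted sample was preferred), the function name `cmos`, and the example ratings are all assumptions made for illustration; none of them come from the paper.

```python
import math
import statistics


def cmos(ratings):
    """Return (mean, 95% half-width) for a list of comparative ratings.

    Assumption: each rating is one listener's judgment of a sample pair on a
    symmetric scale such as -3..+3, where positive favors the converted sample.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings")
    mean = statistics.fmean(ratings)
    # Normal approximation; a t-interval is more appropriate for small panels.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(n)
    return mean, half_width


# Hypothetical ratings, not data from the paper.
ratings = [2, 1, 0, 3, 1, -1, 2, 1]
mean, half_width = cmos(ratings)
print(f"CMOS = {mean:+.2f} +/- {half_width:.2f}")
```

A mean whose interval lies entirely above zero would suggest listeners consistently preferred the converted sample, which is one way to read "positive results" in such a test.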

Code Implementations: 1 repo