SD, CL, AS | May 27, 2019

ET-GAN: Cross-Language Emotion Transfer Based on Cycle-Consistent Generative Adversarial Networks

arXiv:1905.11173v3 | 10 citations
Originality: Incremental advance
AI Analysis

This paper addresses cross-language emotion transfer for speech synthesis. Its approach needs no aligned data, though the advance is incremental because it builds on existing GAN methods.

The paper tackles the problem of transferring emotion information across different languages in speech without requiring parallel training data, achieving high-quality emotional speech generation for any given emotion category.

Despite the remarkable progress made in synthesizing emotional speech from text, it is still challenging to add emotion information to existing speech segments. Previous methods mainly rely on parallel data, and few works have studied how well a single model can transfer emotion information across different languages. To address these problems, we propose an emotion transfer system named ET-GAN, which learns language-independent emotion transfer from one emotion to another without parallel training samples. Based on a cycle-consistent generative adversarial network, our method ensures that only emotion information is transferred between utterances, using simple loss designs. In addition, we introduce an approach for migrating emotion information across different languages by using transfer learning. The experimental results show that our method can efficiently generate high-quality emotional speech for any given emotion category, without aligned speech pairs.
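
As a rough illustration of the cycle-consistency idea the abstract refers to, the sketch below computes the generic CycleGAN cycle term on toy feature vectors. It is not ET-GAN's actual loss design, which the abstract does not spell out. The names `l1_distance`, `cycle_consistency_loss`, and `make_shift` are hypothetical, and the "generators" are plain offset functions standing in for neural networks that map, say, neutral speech features to angry ones and back.

```python
from typing import Callable, List, Sequence

Features = List[float]
Generator = Callable[[Sequence[float]], Features]


def l1_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def cycle_consistency_loss(
    g_ab: Generator,
    g_ba: Generator,
    batch_a: Sequence[Sequence[float]],
    batch_b: Sequence[Sequence[float]],
) -> float:
    """Generic CycleGAN cycle term (an assumption, not the paper's exact loss).

    A sample from domain A mapped to B and back should match the original,
    and likewise for samples from B. No paired (a, b) examples are needed.
    """
    forward = sum(l1_distance(g_ba(g_ab(a)), a) for a in batch_a) / len(batch_a)
    backward = sum(l1_distance(g_ab(g_ba(b)), b) for b in batch_b) / len(batch_b)
    return forward + backward


def make_shift(offset: float) -> Generator:
    """Hypothetical toy generator: shifts every feature by a constant."""
    return lambda x: [v + offset for v in x]


# Unpaired toy batches: the two domains share no aligned samples.
neutral_batch = [[1.0, 2.0]]
angry_batch = [[3.0, 4.0]]

# Exact inverses reconstruct the input, so the cycle term is zero.
perfect = cycle_consistency_loss(
    make_shift(2.0), make_shift(-2.0), neutral_batch, angry_batch
)

# A mismatched inverse leaves a residual of 0.5 per feature in each direction.
imperfect = cycle_consistency_loss(
    make_shift(2.0), make_shift(-1.5), neutral_batch, angry_batch
)
```

In a full CycleGAN-style system this term is combined with adversarial losses from the discriminators. The cycle term is what lets training proceed without aligned speech pairs, since each sample is only ever compared against its own reconstruction.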

