AS · LG · SD · Oct 22, 2019

CycleGAN Voice Conversion of Spectral Envelopes using Adversarial Weights

arXiv:1910.12614v2 · 2 citations
Originality: Incremental advance
AI Analysis

This paper addresses voice conversion for speech processing applications. It is an incremental improvement built on specific training enhancements.

The paper tackles GAN optimization and stability issues in voice conversion by proposing spectral envelopes as inputs and two adversarial weight training paradigms (generalized weighted GAN and generator impact GAN) to reduce the generator's impact on the discriminator, with an energy constraint improving conversion quality. On the Voice Conversion Challenge 2018 dataset, the method achieves state-of-the-art results with reduced network complexity and outperforms a previous weighted GAN approach.

This paper tackles GAN optimization and stability issues in the context of voice conversion. First, to simplify the conversion task, we propose to use spectral envelopes as inputs. Second, we propose two adversarial weight training paradigms, the generalized weighted GAN and the generator impact GAN. Both aim to reduce the impact of the generator on the discriminator, so that the two networks can learn more gradually and efficiently during training. Applying an energy constraint to the CycleGAN paradigm considerably improves conversion quality. A subjective experiment on a voice conversion task using the Voice Conversion Challenge 2018 dataset shows, first, that the proposed method achieves state-of-the-art results despite a significantly reduced network complexity, and second, that the proposed weighted GAN methods outperform a previously proposed one.
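
As a rough illustration of two ingredients the abstract names, the sketch below computes a cycle-consistency loss and a per-frame energy penalty on toy spectral envelopes. Everything in it is an assumption made for illustration: the function names, the L1 form of the cycle loss, the squared-error form of the energy penalty, and the definition of frame energy as the sum of squared envelope bins. The abstract does not state the paper's actual formulation, and the weighted GAN objectives are not sketched here because the abstract gives no detail on them.

```python
from typing import Callable, List

Frame = List[float]       # one spectral-envelope frame (hypothetical linear magnitudes)
Envelope = List[Frame]    # a sequence of frames
Generator = Callable[[Envelope], Envelope]


def frame_energy(frame: Frame) -> float:
    """Energy of one frame, assumed here to be the sum of squared bins."""
    return sum(v * v for v in frame)


def cycle_loss(x: Envelope, g_xy: Generator, g_yx: Generator) -> float:
    """Mean absolute error between x and its round trip g_yx(g_xy(x))."""
    x_rec = g_yx(g_xy(x))
    total = sum(
        abs(a - b)
        for fr, fr_rec in zip(x, x_rec)
        for a, b in zip(fr, fr_rec)
    )
    count = sum(len(fr) for fr in x)
    return total / count


def energy_loss(x: Envelope, g_xy: Generator) -> float:
    """Mean squared difference between input and converted frame energies."""
    y = g_xy(x)
    diffs = [(frame_energy(a) - frame_energy(b)) ** 2 for a, b in zip(x, y)]
    return sum(diffs) / len(x)


# Toy stand-ins for the two generators (real ones would be neural networks).
def scale_up(env: Envelope) -> Envelope:
    return [[2.0 * v for v in fr] for fr in env]


def scale_down(env: Envelope) -> Envelope:
    return [[0.5 * v for v in fr] for fr in env]


def identity(env: Envelope) -> Envelope:
    return [list(fr) for fr in env]


x = [[1.0, 2.0], [0.5, 0.0]]
cycle = cycle_loss(x, scale_up, scale_down)   # round trip is exact for this toy pair
energy_same = energy_loss(x, identity)        # energy preserved
energy_scaled = energy_loss(x, scale_up)      # energy changed, so the penalty is positive
```

The toy pair shows why the two terms are complementary: `scale_up` followed by `scale_down` reconstructs the input perfectly, so the cycle loss alone cannot detect that the converted frames have a different energy, whereas the energy penalty can.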

