AS, LG, SD, ML | Nov 9, 2018

AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms

arXiv:1811.04076v1 | 115 citations
Originality: Incremental advance
AI Analysis

This paper addresses voice conversion for speech processing applications. It offers a more efficient and flexible approach that needs neither aligned data nor phoneme labels, though the advance is incremental because it builds on existing sequence-to-sequence techniques.

The paper tackles voice conversion by proposing a sequence-to-sequence method with attention and context preservation mechanisms. It achieves higher speech quality than Gaussian mixture model-based VC and quality comparable to recurrent neural network-based text-to-speech synthesis, and it trains in one day on a single GPU.

This paper describes a method based on sequence-to-sequence (Seq2Seq) learning with attention and context preservation mechanisms for voice conversion (VC) tasks. Seq2Seq has excelled at numerous tasks involving sequence modeling, such as speech synthesis and recognition, machine translation, and image captioning. In contrast to current VC techniques, our method 1) stabilizes and accelerates the training procedure by considering guided attention and proposed context preservation losses, 2) allows not only spectral envelopes but also fundamental frequency contours and durations of speech to be converted, 3) requires no context information such as phoneme labels, and 4) requires no time-aligned source and target speech data in advance. In our experiment, the proposed VC framework can be trained in only one day, using only one GPU of an NVIDIA Tesla K80, while the quality of the synthesized speech is higher than that of speech converted by Gaussian mixture model-based VC and is comparable to that of speech generated by recurrent neural network-based text-to-speech synthesis, which can be regarded as an upper limit on VC performance.
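The abstract names a guided attention loss but does not define it here. As a rough illustration only, the sketch below uses the formulation commonly attributed to guided attention in attention-based speech synthesis: attention weights far from the diagonal of the source-target alignment matrix are penalized, which encourages a roughly monotonic alignment. The function names, the `width` parameter, and its default of 0.2 are assumptions made for this example and are not taken from the paper; the paper's own loss and settings may differ.

```python
import math


def guided_attention_weights(n_source, n_target, width=0.2):
    """Penalty matrix W[n][t] = 1 - exp(-((n/N - t/T)^2) / (2 * width^2)).

    W is 0 on the diagonal (n/N == t/T) and grows toward 1 away from it.
    `width` is a hypothetical default chosen for illustration.
    """
    return [
        [
            1.0 - math.exp(-((n / n_source - t / n_target) ** 2) / (2.0 * width ** 2))
            for t in range(n_target)
        ]
        for n in range(n_source)
    ]


def guided_attention_loss(attention, width=0.2):
    """Mean of attention[n][t] * W[n][t] over all source/target positions."""
    n_source = len(attention)
    n_target = len(attention[0])
    weights = guided_attention_weights(n_source, n_target, width)
    total = sum(
        attention[n][t] * weights[n][t]
        for n in range(n_source)
        for t in range(n_target)
    )
    return total / (n_source * n_target)


# Toy alignments: one perfectly diagonal, one running the "wrong" way.
diagonal = [[1.0 if n == t else 0.0 for t in range(4)] for n in range(4)]
anti_diagonal = [[1.0 if n + t == 3 else 0.0 for t in range(4)] for n in range(4)]

print(guided_attention_loss(diagonal))
print(guided_attention_loss(anti_diagonal))
```

In this toy case the diagonal alignment incurs no penalty while the reversed alignment does, which is the behavior that would steer training toward stable, near-monotonic attention.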

