Sébastien Le Maguer

2papers

2 Papers

CLDec 13, 2017Code

Creating New Language and Voice Components for the Updated MaryTTS Text-to-Speech Synthesis Platform

Ingmar Steiner, Sébastien Le Maguer

We present a new workflow to create components for the MaryTTS text-to-speech synthesis platform, which is popular with researchers and developers, extending it to support new languages and custom synthetic voices. This workflow replaces the previous toolkit with an efficient, flexible process that leverages modern build automation and cloud-hosted infrastructure. Moreover, it is compatible with the updated MaryTTS architecture, enabling new features and state-of-the-art paradigms such as synthesis based on deep neural networks (DNNs). Like MaryTTS itself, the new tools are free, open source software (FOSS), and promote the use of open data.

HCDec 30, 2016

Synthesis of Tongue Motion and Acoustics from Text using a Multimodal Articulatory Database

Ingmar Steiner, Sébastien Le Maguer, Alexander Hewer

We present an end-to-end text-to-speech (TTS) synthesis system that generates audio and synchronized tongue motion directly from text. This is achieved by adapting a 3D model of the tongue surface to an articulatory dataset and training a statistical parametric speech synthesis system directly on the tongue model parameters. We evaluate the model at every step by comparing the spatial coordinates of predicted articulatory movements against the reference data. The results indicate a global mean Euclidean distance of less than 2.8 mm, and our approach can be adapted to add an articulatory modality to conventional TTS applications without the need for extra data.