An Objective Evaluation Framework for Pathological Speech Synthesis
This work addresses inconsistent evaluation in pathological speech synthesis, a problem for both researchers and developers. Its contribution is incremental, since it builds on existing detection and analysis methods.
The authors address the lack of a standardised objective evaluation framework for pathological speech synthesis by proposing a general framework that assesses voice quality and intelligibility. They also develop a dysarthric voice conversion system, based on CycleGAN-VC and a PSOLA-based speech rate modification technique, that synthesises speech with varying levels of intelligibility.
The development of pathological speech systems is currently hindered by the lack of a standardised objective evaluation framework. In this work, (1) we utilise existing detection and analysis techniques to propose a general framework for the consistent evaluation of synthetic pathological speech. The framework evaluates both the voice quality and the intelligibility of speech, and our experiments show that these two aspects are complementary. (2) Using the proposed evaluation framework, we develop and test a dysarthric voice conversion (VC) system based on CycleGAN-VC and a PSOLA-based speech rate modification technique. We show that the developed system is able to synthesise dysarthric speech with different levels of speech intelligibility.