SD CL AS Jan 2, 2020

Excitation-based Voice Quality Analysis and Modification

arXiv:2001.00582v1
AI Analysis

This work addresses voice quality modification in speech synthesis, but it is incremental as it builds on existing HMM-based methods with specific excitation rules.

The paper analyzes excitation differences across modal, soft, and loud voice qualities in a single-speaker corpus and uses these findings to build a voice quality transformation system for HMM-based speech synthesis, achieving the transformations while maintaining the delivered quality.

This paper investigates the differences occurring in the excitation for different voice qualities. Its goal is two-fold. First, a large corpus containing three voice qualities (modal, soft and loud) uttered by the same speaker is analyzed, and significant differences in characteristics extracted from the excitation are observed. Second, modification rules derived from the analysis are used to build a voice quality transformation system applied as a post-process to HMM-based speech synthesis. The system is shown to effectively achieve the transformations while maintaining the delivered quality.
