SD, CL, AS | Dec 29, 2019

Glottal Source Processing: from Analysis to Applications

arXiv:1912.12604v1 | 115 citations
Originality: Synthesis-oriented
AI Analysis

The paper tackles the problem of enhancing voice technology applications with glottal flow analysis, which is often avoided because of its complexity. The work is incremental, since it reviews existing techniques.

This review addresses the underutilization of glottal flow analysis in voice technology by providing an overview of techniques for glottal source processing, from fundamental tools like pitch tracking and glottal closure instant detection to their integration into applications.

The great majority of current voice technology applications rely on acoustic features characterizing the vocal tract response, such as the widely used MFCC or LPC parameters. Nonetheless, the airflow passing through the vocal folds, called the glottal flow, is expected to carry relevant complementary information. Unfortunately, glottal analysis from speech recordings requires specific and more complex processing operations, which explains why it has generally been avoided. This review gives a general overview of techniques designed for glottal source processing. Starting from the fundamental analysis tools of pitch tracking, glottal closure instant detection, and glottal flow estimation and modelling, the paper then highlights how these solutions can be properly integrated within various voice technology applications.

