SD · CL · AS · Dec 29, 2019

Complex Cepstrum-based Decomposition of Speech for Glottal Source Estimation

arXiv:1912.12602v1 · 81 citations
Originality: Incremental advance
AI Analysis

This is an incremental improvement for speech processing researchers, offering a faster alternative to existing methods for glottal flow estimation.

The paper tackles glottal source estimation in speech processing by proposing a complex cepstrum-based decomposition method that runs much faster than the Zeros of the Z-Transform (ZZT) decomposition.

Homomorphic analysis is a well-known method for the separation of non-linearly combined signals. More particularly, the use of the complex cepstrum for source-tract deconvolution has been discussed in various articles. However, no study has yet proposed a cepstrum-based glottal flow estimation methodology and reported effective results. In this paper, we show that the complex cepstrum can be effectively used for glottal flow estimation by separating the causal and anticausal components of a windowed speech signal, as done by the Zeros of the Z-Transform (ZZT) decomposition. Based on exactly the same principles presented for the ZZT decomposition, windowing should be applied such that the windowed speech signals exhibit mixed-phase characteristics that conform to the speech production model, in which the anticausal component is mainly due to the glottal flow open phase. The advantage of the complex cepstrum-based approach compared to the ZZT decomposition is its much higher speed.
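
The causal/anticausal separation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`complex_cepstrum`, `causal_anticausal_decomposition`), the FFT size, and the handling of the gain term are assumptions made for illustration. It also assumes the input frame has already been windowed appropriately and has no spectral zeros exactly on the unit circle. The idea is that, once the linear phase is removed, positive quefrencies of the complex cepstrum correspond to the causal (minimum-phase) part and negative quefrencies to the anticausal (maximum-phase) part.

```python
import numpy as np


def complex_cepstrum(frame, nfft):
    """Complex cepstrum of a real frame, with the linear phase term removed.

    Assumes the spectrum has no zeros exactly on the unit circle.
    Returns the cepstrum and the sign that was factored out of the spectrum.
    """
    X = np.fft.fft(frame, nfft)
    # Factor out a negative sign so the phase starts at 0 rather than pi.
    sign = -1.0 if X[0].real < 0 else 1.0
    X = sign * X
    log_mag = np.log(np.abs(X))
    phase = np.unwrap(np.angle(X))
    # A pure delay of d samples contributes -pi*d to the phase at omega = pi.
    half = nfft // 2
    delay = int(round(-phase[half] / np.pi))
    phase = phase + np.pi * delay * np.arange(nfft) / half
    ceps = np.fft.ifft(log_mag + 1j * phase).real
    return ceps, sign


def causal_anticausal_decomposition(frame, nfft=4096):
    """Split a frame into causal and anticausal components via the cepstrum.

    Illustrative choice: the zero-quefrency (gain) term and the sign are
    assigned to the causal component.
    """
    ceps, sign = complex_cepstrum(frame, nfft)
    half = nfft // 2
    causal_ceps = np.zeros(nfft)
    anticausal_ceps = np.zeros(nfft)
    causal_ceps[:half] = ceps[:half]                 # quefrencies n >= 0
    anticausal_ceps[half + 1:] = ceps[half + 1:]     # quefrencies n < 0
    # Invert the homomorphic mapping: cepstrum -> log spectrum -> spectrum.
    causal = sign * np.fft.ifft(np.exp(np.fft.fft(causal_ceps))).real
    anticausal = np.fft.ifft(np.exp(np.fft.fft(anticausal_ceps))).real
    return causal, anticausal


if __name__ == "__main__":
    # Toy mixed-phase signal: one zero inside and one outside the unit circle.
    x = np.convolve([1.0, 0.5], [0.5, 1.0])
    causal, anticausal = causal_anticausal_decomposition(x, nfft=1024)
    print(causal[:3])
    print(anticausal[0], anticausal[-1])
```

In this sketch the anticausal component is stored with negative time indices wrapped to the end of the array, so `anticausal[-1]` is the sample at time -1. On real speech, the anticausal output would be the candidate glottal open-phase contribution, but the result depends strongly on the window shape and position, which this sketch does not address.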

