SD CL AS Dec 30, 2019

Causal-Anticausal Decomposition of Speech using Complex Cepstrum for Glottal Source Estimation

Thomas Drugman, Baris Bozkurt, Thierry Dutoit

arXiv:1912.12843v16.572 citations

Originality Synthesis-oriented

AI Analysis

This is an incremental improvement for speech analysis, potentially aiding voice quality analysis in real expressive speech.

The paper tackles glottal flow estimation in speech processing by applying the complex cepstrum to causal-anticausal decomposition. It shows that this approach gives glottal estimates similar to those of the ZZT method while running much faster, because it relies on FFT operations rather than the factoring of high-degree polynomials.

The complex cepstrum is known in the literature for linearly separating causal and anticausal components. Relying on advances achieved by the Zeros of the Z-Transform (ZZT) technique, we here investigate the possibility of using the complex cepstrum for glottal flow estimation on a large-scale database. Via a systematic study of the windowing effects on the deconvolution quality, we show that the complex cepstrum causal-anticausal decomposition can be effectively used for glottal flow estimation when specific windowing criteria are met. It is also shown that this complex cepstral decomposition gives glottal estimates similar to those obtained with the ZZT method. However, as the complex cepstrum uses FFT operations instead of requiring the factoring of high-degree polynomials, the method benefits from a much higher speed. Finally, in our tests on a large corpus of real expressive speech, we show that the proposed method has the potential to be used for voice quality analysis.
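To make the idea in the abstract concrete, here is a minimal NumPy sketch of a complex-cepstrum causal-anticausal split. It is an illustration under stated assumptions, not the authors' implementation. The function names (`complex_cepstrum`, `split_causal_anticausal`, `cepstrum_to_signal`), the FFT size, and the choice to put the gain term c[0] in the causal part are all hypothetical. The sketch also assumes a frame with a positive DC value and no spectral zeros on the unit circle, and it omits the windowing step that the abstract identifies as critical.

```python
import numpy as np


def complex_cepstrum(frame, nfft=1024):
    """Complex cepstrum of a real frame via FFT (sketch; assumes X(0) > 0)."""
    spectrum = np.fft.fft(frame, nfft)
    log_mag = np.log(np.abs(spectrum))
    phase = np.unwrap(np.angle(spectrum))
    # Remove the linear phase term so the log-spectrum is periodic.
    center = nfft // 2
    ndelay = int(np.round(phase[center] / np.pi))
    phase = phase - np.pi * ndelay * np.arange(nfft) / center
    ceps = np.fft.ifft(log_mag + 1j * phase).real
    return ceps, ndelay


def split_causal_anticausal(ceps):
    """Split the cepstrum into positive and negative quefrencies.

    Assumption: the gain term c[0] is assigned to the causal part.
    """
    half = len(ceps) // 2
    causal = np.zeros_like(ceps)
    anticausal = np.zeros_like(ceps)
    causal[:half] = ceps[:half]          # quefrencies n >= 0
    anticausal[half:] = ceps[half:]      # quefrencies n < 0 (wrapped)
    return causal, anticausal


def cepstrum_to_signal(ceps):
    """Invert a cepstrum: time signal = IFFT(exp(FFT(cepstrum)))."""
    return np.fft.ifft(np.exp(np.fft.fft(ceps))).real


# Toy check: a minimum-phase factor convolved with a maximum-phase factor.
toy = np.convolve([1.0, 0.5], [1.0, 2.0])
ceps, ndelay = complex_cepstrum(toy)
causal_ceps, anticausal_ceps = split_causal_anticausal(ceps)
causal_part = cepstrum_to_signal(causal_ceps)
anticausal_part = cepstrum_to_signal(anticausal_ceps)
```

In the toy check, the zero inside the unit circle ends up in `causal_part` and the zero outside it ends up in `anticausal_part`, whose samples at negative time indices wrap to the end of the array. Every step is an FFT, a logarithm, or an exponential, with no polynomial root finding, which is the source of the speed advantage the abstract describes.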
