Continuous Wavelet Vocoder-based Decomposition of Parametric Speech Waveform Synthesis
This work addresses the computational bottleneck in real-time speech synthesis systems, offering an incremental improvement for applications requiring faster waveform generation.
The paper tackles the inefficiency of WaveNet's sequential waveform generation by proposing a continuous wavelet vocoder-based decomposition method for parametric speech synthesis, achieving a 50% reduction in synthesis time while maintaining comparable speech quality.
To date, various speech technology systems have adopted the vocoder approach, a method for synthesizing speech waveforms that plays a major role in the performance of statistical parametric speech synthesis. WaveNet, one of the best models, produces speech that closely resembles the human voice, but it has to generate the waveform in a time-consuming sequential manner using a highly complex neural network structure.