SDAS, Dec 9, 2018

Increase Apparent Public Speaking Fluency By Speech Augmentation

arXiv:1812.03415v2, 6 citations
AI Analysis

This paper addresses the problem of improving public speaking fluency for non-professional speakers, representing an incremental advancement in speech processing.

The paper tackles the problem of non-professional speakers producing disfluent speech by developing a system that manipulates disfluencies like filler words and pauses to enhance fluency. The result is a significant increase in speech fluency, as quantitatively measured by reduced pause and filler rates.

Fluent and confident speech is desirable to every speaker, but professional speech delivery requires a great deal of experience and practice. In this paper, we propose a speech stream manipulation system that can help non-professional speakers produce fluent, professional-like speech content, in turn contributing towards better listener engagement and comprehension. We propose to achieve this by manipulating the disfluencies in human speech, such as the sounds 'uh' and 'um', filler words, and awkward long silences. Given any unrehearsed speech, we segment and silence the filled pauses, then adjust the duration of the imposed silences as well as other long ('disfluent') pauses using a predictive model learned from a professional speech dataset. Finally, we output an audio stream in which the speaker sounds more fluent, confident and practiced than in the original recording. According to our quantitative evaluation, we significantly increase the fluency of speech by reducing the rate of pauses and fillers.
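
The pipeline described in the abstract (silence the filled pauses, then shorten the resulting long silences) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Segment` type, the `smooth_disfluencies` function, and the fixed `max_pause` cap are hypothetical names introduced here, and the cap is only a stand-in for the predictive model the paper learns from professional speech. The sketch also assumes the speech has already been segmented and labelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    kind: str        # "speech", "filler" (e.g. 'uh', 'um'), or "pause"
    duration: float  # length in seconds


def smooth_disfluencies(segments: List[Segment], max_pause: float = 0.5) -> List[Segment]:
    """Silence fillers, merge adjacent silences, and cap long pauses.

    max_pause is an assumed constant standing in for the learned
    pause-duration model mentioned in the abstract.
    """
    merged: List[Segment] = []
    for seg in segments:
        # A silenced filler is treated as an ordinary pause.
        kind = "pause" if seg.kind == "filler" else seg.kind
        if kind == "pause" and merged and merged[-1].kind == "pause":
            # Combine back-to-back silences into a single pause.
            merged[-1].duration += seg.duration
        else:
            merged.append(Segment(kind, seg.duration))

    # Shorten any pause that exceeds the assumed target duration.
    for seg in merged:
        if seg.kind == "pause":
            seg.duration = min(seg.duration, max_pause)
    return merged


example = [
    Segment("speech", 1.0),
    Segment("pause", 0.2),
    Segment("filler", 0.4),
    Segment("pause", 0.3),
    Segment("speech", 2.0),
    Segment("pause", 1.5),
    Segment("speech", 0.5),
]
smoothed = smooth_disfluencies(example)
```

In a real system the segment labels would come from a filled-pause detector and the target pause length would be predicted per pause rather than fixed; the sketch only shows how the two manipulation steps compose.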
