Flowchase: a Mobile Application for Pronunciation Training
This work addresses pronunciation training for English learners, but it appears incremental: it applies existing speech technology methods in a mobile application context.
The paper tackles the problem of providing personalized pronunciation feedback to English learners. It presents Flowchase, a mobile app whose speech processing pipeline analyzes segmental and supra-segmental features and returns instant feedback based on machine learning models.
In this paper, we present a solution for providing personalized and instant feedback to English learners through a mobile application, called Flowchase, that is connected to a speech technology able to segment speech and analyze its segmental and supra-segmental features. The speech processing pipeline receives a speech sample along with linguistic information corresponding to the utterance to analyze. After validation of the speech sample, joint forced alignment and phonetic recognition are performed by a combination of machine learning models based on speech representation learning, which provides the information needed to design feedback on a series of segmental and supra-segmental pronunciation aspects.
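The pipeline stages described above (sample validation, alignment and recognition, feedback design) can be illustrated with a minimal sketch. It is not the Flowchase implementation: every name here (`Segment`, `validate_sample`, `segmental_feedback`, `run_pipeline`), the duration thresholds, and the ARPAbet-style phoneme labels are assumptions made for illustration. The joint forced-alignment and phonetic-recognition step is not modeled; its output is simply passed in as a list of aligned segments, and only segmental feedback is shown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One aligned phone (hypothetical structure, not from the paper)."""
    expected: str               # phoneme expected from the linguistic information
    recognized: Optional[str]   # phoneme returned by the recognizer, None if not found
    start: float                # segment start time in seconds
    end: float                  # segment end time in seconds


def validate_sample(duration_s: float, min_s: float = 0.3, max_s: float = 30.0) -> bool:
    """Accept a recording only if its duration is plausible (assumed thresholds)."""
    return min_s <= duration_s <= max_s


def segmental_feedback(segments: List[Segment]) -> List[str]:
    """Compare expected and recognized phonemes and build feedback messages."""
    messages = []
    for seg in segments:
        if seg.recognized is None:
            messages.append(f"'{seg.expected}' was not detected")
        elif seg.recognized != seg.expected:
            messages.append(f"'{seg.expected}' sounded like '{seg.recognized}'")
    return messages


def run_pipeline(duration_s: float, segments: List[Segment]) -> List[str]:
    """Validate the sample, then derive feedback from the aligned segments.

    The alignment and recognition models are out of scope for this sketch,
    so `segments` stands in for their output.
    """
    if not validate_sample(duration_s):
        return ["Recording rejected: please try again"]
    return segmental_feedback(segments)


if __name__ == "__main__":
    # A learner attempting "thing" (TH IH NG) who substitutes S for TH.
    aligned = [
        Segment("TH", "S", 0.00, 0.10),
        Segment("IH", "IH", 0.10, 0.20),
        Segment("NG", None, 0.20, 0.30),
    ]
    for message in run_pipeline(0.3, aligned):
        print(message)
```

Keeping validation separate from feedback means a rejected recording never reaches the comparison step, which mirrors the order of stages given in the abstract.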