Transparent pronunciation scoring using articulatorily weighted phoneme edit distance
This work addresses the need for clear feedback in gamified language learning for children, though the contribution is incremental: it builds on existing phoneme recognition and edit distance techniques.
The paper tackles the problem of providing transparent feedback for pronunciation learning by developing a scoring system based on weighted phoneme edit distance, which can produce human-readable mispronunciation lists and is compared to established black-box methods.
To study the effects of gamification on foreign language learning for children in the "Say It Again, Kid!" project, we developed a feedback paradigm that can drive gameplay in pronunciation learning games. We describe our scoring system, which is based on the difference between a reference phone sequence and the output of a multilingual CTC phoneme recogniser. We present a white-box scoring model of mapped weighted Levenshtein edit distance between the reference and the recognised sequence, with error weights for articulatory differences computed from a training set of scored utterances. The system can produce a human-readable list of each detected mispronunciation's contribution to the utterance score. We compare our scoring method to established black-box methods.
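The scoring idea described above, a weighted Levenshtein distance whose individual edits can be listed with their costs, can be illustrated with a minimal sketch. This is not the authors' implementation. The feature table `FEATURES`, the cost rule in `substitution_cost`, and the constant `INDEL_COST` are hypothetical stand-ins chosen for illustration. In the paper the error weights are computed from a training set of scored utterances, whereas here they are set by hand.

```python
from typing import List, Tuple

# One detected edit: (reference phone or "-", recognised phone or "-", cost).
Edit = Tuple[str, str, float]

# Hypothetical articulatory features (voicing, place, manner); illustrative only.
FEATURES = {
    "p": ("voiceless", "bilabial", "plosive"),
    "b": ("voiced", "bilabial", "plosive"),
    "t": ("voiceless", "alveolar", "plosive"),
    "d": ("voiced", "alveolar", "plosive"),
    "s": ("voiceless", "alveolar", "fricative"),
    "z": ("voiced", "alveolar", "fricative"),
}

# Assumed flat cost for an inserted or deleted phone.
INDEL_COST = 1.0


def substitution_cost(ref: str, hyp: str) -> float:
    """Assumed rule: cost is the fraction of articulatory features that differ."""
    if ref == hyp:
        return 0.0
    if ref not in FEATURES or hyp not in FEATURES:
        return 1.0  # phones without a feature entry get the full cost
    differing = sum(a != b for a, b in zip(FEATURES[ref], FEATURES[hyp]))
    return differing / len(FEATURES[ref])


def weighted_edit_distance(ref: List[str], hyp: List[str]) -> Tuple[float, List[Edit]]:
    """Return the total weighted distance and the list of non-zero-cost edits."""
    n, m = len(ref), len(hyp)
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + INDEL_COST
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + INDEL_COST
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(
                dist[i - 1][j - 1] + substitution_cost(ref[i - 1], hyp[j - 1]),
                dist[i - 1][j] + INDEL_COST,  # reference phone deleted
                dist[i][j - 1] + INDEL_COST,  # extra phone inserted
            )

    # Backtrace one optimal alignment to recover each edit's contribution.
    edits: List[Edit] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = substitution_cost(ref[i - 1], hyp[j - 1])
            if abs(dist[i][j] - (dist[i - 1][j - 1] + sub)) < 1e-9:
                if sub > 0:
                    edits.append((ref[i - 1], hyp[j - 1], sub))
                i, j = i - 1, j - 1
                continue
        if i > 0 and abs(dist[i][j] - (dist[i - 1][j] + INDEL_COST)) < 1e-9:
            edits.append((ref[i - 1], "-", INDEL_COST))
            i -= 1
        else:
            edits.append(("-", hyp[j - 1], INDEL_COST))
            j -= 1
    edits.reverse()
    return dist[n][m], edits


if __name__ == "__main__":
    total, edits = weighted_edit_distance(["b", "a", "d"], ["p", "a", "t"])
    for ref_phone, hyp_phone, cost in edits:
        print(f"{ref_phone} -> {hyp_phone}: {cost:.2f}")
    print(f"total: {total:.2f}")
```

In this example the reference and recognised sequences differ in two places, and each is a substitution between phones that differ only in voicing, so each contributes a small cost instead of a full unit. The list of edits is what makes the score inspectable: the total is the sum of the listed contributions.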