CL SD AS · Feb 1, 2020

Deep segmental phonetic posterior-grams based discovery of non-categories in L2 English speech

Xu Li, Xixin Wu, Xunying Liu, Helen Meng

arXiv:2002.00205v10.2

Originality: Incremental advance

AI Analysis

This work addresses non-categorical errors in L2 speech for language learners and educators, but it is incremental as it extends existing categorical approaches.

The paper tackles non-categorical errors in second language (L2) English speech, which mispronunciation detection often overlooks. It uses segmental phonetic posterior-grams (SPPGs) to identify non-categories, and perceptual experiments show the confusion degree increasing by 7.3% and 7.5% under two measures.

Second language (L2) speech is often labeled with native phone categories. However, in many cases it is difficult to decide which categorical phone an L2 segment belongs to. These segments are regarded as non-categories. Most existing approaches to Mispronunciation Detection and Diagnosis (MDD) are concerned only with categorical errors, i.e. a phone category is inserted, deleted, or substituted by another; non-categorical errors are not considered. To model these non-categorical errors, this work explores non-categorical patterns to extend the categorical phone set. We apply a phonetic segment classifier to generate segmental phonetic posterior-grams (SPPGs) that represent phone segment-level information. We then discover non-categories by looking for SPPGs with more than one peak. Compared with the baseline system, this approach finds more non-categorical patterns, and perceptual experiments show that the discovered non-categories are more accurate, with the confusion degree increasing by 7.3% and 7.5% under two different measures. Finally, we give a preliminary analysis of the reasons behind these non-categories.
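The multi-peak idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how a "peak" is defined, so this example assumes a peak is any phone class whose segment-level posterior is at least a threshold `min_prob`. The function names, the threshold value, and the toy posteriors are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def top_peaks(sppg: Dict[str, float], min_prob: float = 0.3) -> List[Tuple[str, float]]:
    """Return (phone, posterior) pairs with posterior >= min_prob, highest first.

    Assumption: a "peak" is a phone class whose posterior reaches min_prob.
    The paper may use a different criterion.
    """
    peaks = [(phone, p) for phone, p in sppg.items() if p >= min_prob]
    return sorted(peaks, key=lambda item: item[1], reverse=True)


def is_non_category(sppg: Dict[str, float], min_prob: float = 0.3) -> bool:
    """Flag a segment as a candidate non-category if its SPPG has more than one peak."""
    return len(top_peaks(sppg, min_prob)) > 1


# Toy segment-level posteriors over three phone classes (made-up values).
segments = {
    "seg_a": {"ih": 0.90, "iy": 0.05, "eh": 0.05},  # one clear peak
    "seg_b": {"ih": 0.48, "iy": 0.45, "eh": 0.07},  # two competing peaks
}

flagged = [name for name, sppg in segments.items() if is_non_category(sppg)]
print(flagged)  # ['seg_b']
```

Here `seg_b` is flagged because the classifier splits its probability mass between two phones, which is the kind of segment the abstract describes as hard to assign to a single category. The threshold governs how strict the flagging is, and a real system would need to tune it or replace it with the paper's own criterion.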
