Improved Accent Classification Combining Phonetic Vowels with Acoustic Features
This work addresses accent classification for speech processing applications; its contribution is incremental, building on existing approaches that integrate acoustic and phonetic information.
The paper tackles accent classification by combining phonetic vowel information with enhanced acoustic features, achieving 54% accuracy on 7 accent types from the FAE corpus with 20-second test inputs, which is competitive with state-of-the-art methods.
Research has shown that accent classification can be improved by integrating semantic information into a purely acoustic approach. In this work, we combine phonetic knowledge, such as vowels, with enhanced acoustic features to build an improved accent classification system. The classifier is based on a Gaussian Mixture Model-Universal Background Model (GMM-UBM) with normalized Perceptual Linear Predictive (PLP) features. The features are further optimized by Principal Component Analysis (PCA) and Heteroscedastic Linear Discriminant Analysis (HLDA). Using 7 major types of accented speech from the Foreign Accented English (FAE) corpus, the system achieves a classification accuracy of 54% with test inputs as short as 20 seconds, which is competitive with the state of the art in this field.
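To make the GMM-UBM decision rule concrete, the following is a minimal sketch of how such a classifier is commonly built; it is not the authors' implementation. It assumes diagonal-covariance mixtures, mean-only MAP adaptation of the UBM to each accent, and a relevance factor of 16, none of which are stated in the abstract. The function names, the two-component UBM, and the synthetic 2-D frames are hypothetical stand-ins; PLP extraction, normalization, PCA, HLDA, and the vowel-based phonetic component are omitted.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def component_log_probs(x, gmm):
    """log(w_k * N(x | mu_k, var_k)) for every mixture component k."""
    return [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of one frame under a GMM."""
    return log_sum_exp(component_log_probs(x, gmm))


def map_adapt_means(ubm, frames, relevance=16.0):
    """Mean-only MAP adaptation of the UBM towards one accent's frames.

    relevance=16.0 is an assumed value, not taken from the paper.
    """
    dim = len(ubm[0][1])
    counts = [0.0] * len(ubm)
    sums = [[0.0] * dim for _ in ubm]
    for x in frames:
        logs = component_log_probs(x, ubm)
        norm = log_sum_exp(logs)
        for k, lp in enumerate(logs):
            post = math.exp(lp - norm)  # responsibility of component k
            counts[k] += post
            for d in range(dim):
                sums[k][d] += post * x[d]
    adapted = []
    for k, (w, mean, var) in enumerate(ubm):
        # Equivalent to alpha * E_k[x] + (1 - alpha) * mean,
        # with alpha = n_k / (n_k + relevance).
        new_mean = [
            (sums[k][d] + relevance * mean[d]) / (counts[k] + relevance)
            for d in range(dim)
        ]
        adapted.append((w, new_mean, var))
    return adapted


def classify(frames, ubm, accent_models):
    """Pick the accent with the highest average log-likelihood ratio."""
    scores = {}
    for accent, model in accent_models.items():
        llr = sum(
            gmm_log_likelihood(x, model) - gmm_log_likelihood(x, ubm)
            for x in frames
        )
        scores[accent] = llr / len(frames)
    return max(scores, key=scores.get), scores


def toy_frames(rng, centre, n):
    """Synthetic stand-in for feature frames (not real PLP features)."""
    return [[rng.gauss(c, 1.0) for c in centre] for _ in range(n)]


rng = random.Random(0)
# Hypothetical 2-component UBM: (weight, mean, variance) per component.
ubm = [(0.5, [-1.0, 0.0], [4.0, 4.0]), (0.5, [1.0, 0.0], [4.0, 4.0])]
train = {
    "accent_a": toy_frames(rng, [-2.0, 1.0], 200),
    "accent_b": toy_frames(rng, [2.0, -1.0], 200),
}
accent_models = {name: map_adapt_means(ubm, fr) for name, fr in train.items()}
test_frames = toy_frames(rng, [2.0, -1.0], 50)
label, scores = classify(test_frames, ubm, accent_models)
print(label, scores)
```

In a real system the UBM would be trained on pooled speech from all accents, and each frame would be a PLP feature vector after normalization and the PCA and HLDA transforms. Averaging the log-likelihood ratio over frames keeps scores comparable across test inputs of different lengths.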