CL LG SD AS ML

Nov 9, 2018

Native Language Identification using i-vector

Ahmed Nazim Uddin, Md Ashequr Rahman, Md. Rafidul Islam, Mohammad Ariful Haque

arXiv:1811.05540v10.23 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of identifying a speaker's native language from second-language speech. The contribution is incremental: it applies the existing i-vector method to NLI with specific feature enhancements.

The paper tackles Native Language Identification (NLI) with an i-vector based approach using MFCC and GFCC features. On a dataset covering 11 native language backgrounds, it reports accuracy improvements over the baseline of 21.95% with MFCC features and 22.81% with GFCC features.

The task of determining a speaker's native language based only on their speech in a second language is known as Native Language Identification, or NLI. With its growing applications across speech signal processing, NLI has emerged as an important research area in recent years. In this paper we propose an i-vector based approach to building an automatic NLI system using MFCC and GFCC features. We evaluate our framework on the 2016 ComParE Native Language Sub-Challenge dataset, which contains English speech from speakers of 11 different native language backgrounds. Our proposed method outperforms the baseline system, improving accuracy by 21.95% for the MFCC feature based i-vector framework and by 22.81% for the GFCC feature based i-vector framework.
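
As a rough illustration of how fixed-length i-vectors can be turned into a native-language decision, the sketch below scores a test vector against per-language mean vectors using cosine similarity. This is an assumption made for illustration only: the abstract does not state which classifier the authors use, cosine scoring is simply one common back-end for i-vector systems, and the labels `lang_a` and `lang_b`, the function names, and the three-dimensional vectors are all hypothetical toy values, not data from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def class_means(train):
    """Average the training vectors of each label into one mean vector."""
    means = {}
    for label, vectors in train.items():
        dim = len(vectors[0])
        means[label] = [sum(v[i] for v in vectors) / len(vectors)
                        for i in range(dim)]
    return means

def predict(ivector, means):
    """Return the label whose mean vector is most similar to ivector."""
    return max(means, key=lambda label: cosine(ivector, means[label]))

# Hypothetical toy "i-vectors" (invented values; real i-vectors
# typically have hundreds of dimensions).
train = {
    "lang_a": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]],
    "lang_b": [[0.0, 1.0, 0.2], [0.1, 0.8, 0.3]],
}
means = class_means(train)
label = predict([0.8, 0.3, 0.0], means)
print(label)
```

In a full system, the input vectors would be i-vectors extracted from MFCC or GFCC frame features rather than hand-written lists, and the same scoring would run over all 11 native language classes.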
