CL, Jul 22, 2017

Native Language Identification on Text and Speech

arXiv:1707.07182v1
1087 citations
Originality Synthesis-oriented
AI Analysis

This paper offers an incremental improvement in native language identification from text and speech, a task in computational linguistics.

The paper tackled native language identification by developing an ensemble of SVM classifiers using character n-grams, achieving 83.58% accuracy and ranking 3rd in the NLI Shared Task 2017.

This paper presents an ensemble system that combines the output of multiple SVM classifiers for native language identification (NLI). The system was submitted to the fusion track of the NLI Shared Task 2017, which featured student essays and spoken responses, in the form of audio transcriptions and iVectors, by non-native English speakers of eleven native languages. Our system competed in the challenge under the team name ZCD and was based on an ensemble of SVM classifiers trained on character n-grams, achieving 83.58% accuracy and ranking 3rd in the shared task.
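To make the two ingredients named in the abstract more concrete, here is a minimal sketch of character n-gram counting and of fusing several classifiers' predictions. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the classifier outputs are combined, so plurality voting is assumed here, SVM training is left out entirely, and the function names, the n-gram range, and the labels "GER" and "FRE" are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Count every character n-gram of length n_min..n_max in text."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def plurality_vote(predictions):
    """Return the label predicted by the most classifiers.

    Ties go to the tied label that appears first in the list.
    Assumption: the paper may use a different fusion rule.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Character bigrams and trigrams of a short string.
features = char_ngrams("the cat", n_min=2, n_max=3)

# Hypothetical outputs of three already-trained classifiers for one essay.
votes = ["GER", "FRE", "GER"]
fused = plurality_vote(votes)
```

In a full system, each n-gram count vector would be fed to an SVM, and the vote would run over the labels those SVMs predict for the same test document.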

