CL · Apr 24, 2017

Detecting English Writing Styles For Non Native Speakers

arXiv:1704.07441v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of analyzing day-to-day language for non-native speakers, though it is incremental as it applies existing methods to new data.

The paper tackles the problem of classifying English writing styles as native or non-native, achieving 74% accuracy using simple machine learning algorithms and features derived from large-scale data sources like Wikipedia.

To our knowledge, this paper presents the first attempt to classify English writing styles at this scale, with the challenge of classifying day-to-day language written by writers with different backgrounds and covering a wide range of topics. The paper proposes simple machine learning algorithms and simple-to-generate features to solve hard problems, relying on the scale of the data available from large sources of knowledge like Wikipedia. We believe such sources of data are crucial for generating robust solutions for the web that are highly accurate and easy to deploy in practice. The paper achieves 74% accuracy in classifying native versus non-native speakers' writing styles. Moreover, the paper reports some interesting observations on the similarity between different languages, measured by the similarity of their users' English writing styles. This technique could be used to recover some well-known facts about languages, such as their grouping into families, which our experiments support.
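To make the idea of "simple algorithms and simple-to-generate features" concrete, here is a minimal sketch of a native versus non-native style classifier. It is not the paper's implementation: the abstract does not name the features or the learner, so character trigrams and a multinomial Naive Bayes model with add-one smoothing are assumptions chosen for illustration. The class name `NaiveBayesStyleClassifier`, the helper `char_ngrams`, the label strings, and the four toy sentences are all hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayesStyleClassifier:
    """Multinomial Naive Bayes over character n-grams (illustrative assumption)."""

    def __init__(self, n=3):
        self.n = n
        self.counts = {}             # label -> Counter of n-gram counts
        self.doc_counts = Counter()  # label -> number of training texts
        self.vocab = set()           # all n-grams seen during training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.counts.setdefault(label, Counter()).update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior of each label for a text."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, counter in self.counts.items():
            total = sum(counter.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for gram in char_ngrams(text, self.n):
                # Add-one smoothing keeps unseen n-grams from zeroing the score.
                score += math.log((counter[gram] + 1) / (total + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy data, for illustration only; not drawn from the paper's corpus.
toy_texts = [
    "I have been living here for three years.",
    "She doesn't know anything about it.",
    "I am living here since three years.",
    "She don't know nothing about the it.",
]
toy_labels = ["native", "native", "non_native", "non_native"]

model = NaiveBayesStyleClassifier(n=3).fit(toy_texts, toy_labels)
print(model.predict("He is working here since two years."))
```

Four sentences are far too few to learn anything meaningful. The sketch only shows the shape of the pipeline: turn each text into cheap surface features, count them per class, and pick the class with the highest score. The accuracy reported in the abstract depends on training at Wikipedia scale.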

