CL · Feb 14, 2017

JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction

arXiv:1702.04066v1 · 228 citations
Originality: Synthesis-oriented
AI Analysis

JFLEG addresses the need for a new gold standard for assessing GEC systems, which benefits researchers and developers in natural language processing. The contribution is incremental, since it builds on existing corpus work.

The authors address the problem of evaluating grammatical error correction (GEC) systems by creating JFLEG, a new parallel corpus that covers a broad range of language proficiency levels and uses holistic fluency edits to make text more native-sounding. They benchmark four leading GEC systems on the corpus to identify areas for improvement.

We present a new parallel corpus, JHU FLuency-Extended GUG corpus (JFLEG) for developing and evaluating grammatical error correction (GEC). Unlike other corpora, it represents a broad range of language proficiency levels and uses holistic fluency edits to not only correct grammatical errors but also make the original text more native sounding. We describe the types of corrections made and benchmark four leading GEC systems on this corpus, identifying specific areas in which they do well and how they can improve. JFLEG fulfills the need for a new gold standard to properly assess the current state of GEC.
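To make the idea of a parallel corpus used as a benchmark more concrete, the following is a minimal sketch in Python. It assumes each source sentence is paired with one or more human rewrites. The names `ParallelEntry`, `normalize`, and `exact_match_rate` are hypothetical, the sentences are made up for illustration and are not drawn from JFLEG, and the exact-match score is a deliberately crude stand-in rather than the metric used in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ParallelEntry:
    """One source sentence paired with several human rewrites (hypothetical type)."""
    source: str
    references: List[str]


def normalize(sentence: str) -> List[str]:
    # Lowercase and split on whitespace so that trivial casing or spacing
    # differences are not counted as mismatches.
    return sentence.lower().split()


def exact_match_rate(entries: List[ParallelEntry], system_outputs: List[str]) -> float:
    """Fraction of system outputs identical to at least one reference.

    This is a crude stand-in score for illustration, not the paper's metric.
    """
    if len(entries) != len(system_outputs):
        raise ValueError("need exactly one system output per entry")
    if not entries:
        return 0.0
    hits = 0
    for entry, output in zip(entries, system_outputs):
        refs = [normalize(r) for r in entry.references]
        if normalize(output) in refs:
            hits += 1
    return hits / len(entries)


# Made-up sentences for illustration only; they are NOT taken from JFLEG.
corpus = [
    ParallelEntry(
        source="She go to school every days .",
        references=[
            "She goes to school every day .",
            "She goes to school daily .",
        ],
    ),
    ParallelEntry(
        source="I am agree with this opinion .",
        references=[
            "I agree with this opinion .",
            "I agree with this view .",
        ],
    ),
]

outputs = [
    "She goes to school every day .",  # identical to the first reference
    "I am agree with this opinion .",  # left uncorrected by the system
]

rate = exact_match_rate(corpus, outputs)
print(f"exact-match rate: {rate:.2f}")
```

Allowing several references per source reflects the fact that a fluency rewrite rarely has a single correct answer. Exact match still penalizes any valid rewrite the annotators did not happen to produce, which is one reason a graded overlap metric is usually preferred for this kind of evaluation.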

Code Implementations: 1 repo
