Towards Unsupervised Grammatical Error Correction using Statistical Machine Translation with Synthetic Comparable Corpus
This work addresses grammatical error correction for language learners, but its contribution is incremental: it builds on existing statistical methods trained on synthetic data.
The paper tackles grammatical error correction (GEC) with unsupervised phrase-based statistical machine translation trained on a synthetic corpus, achieving an F_0.5 score of 28.31 on the test data of the BEA 2019 low-resource track.
We introduce unsupervised techniques based on phrase-based statistical machine translation for grammatical error correction (GEC), trained on a pseudo learner corpus created by Google Translation. We verified our GEC system through experiments on various GEC datasets, including the low-resource track of the shared task at Building Educational Applications 2019 (BEA 2019). As a result, we achieved an F_0.5 score of 28.31 points on the test data of the low-resource track.
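For readers unfamiliar with the metric, F_0.5 is the weighted harmonic mean of precision and recall with beta = 0.5, so precision counts more than recall. The sketch below shows the standard formula only. The function name `f_beta` and the input values are illustrative assumptions and are not taken from the paper, which reports the score on a 0-100 scale.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Return the F_beta score for precision and recall given in [0, 1].

    With beta < 1 (here 0.5), precision is weighted more heavily than recall.
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0.0:
        # No correct edits were proposed or expected: define the score as 0.
        return 0.0
    return (1.0 + beta_sq) * precision * recall / denominator


# Hypothetical values, chosen only to illustrate the precision weighting.
high_precision = f_beta(precision=0.6, recall=0.2)
high_recall = f_beta(precision=0.2, recall=0.6)
print(f"{high_precision:.4f} vs {high_recall:.4f}")
```

Swapping precision and recall changes the result, which is why a system that proposes few but reliable corrections can score better under F_0.5 than one that proposes many uncertain ones.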