CLOct 23, 2023

System Combination via Quality Estimation for Grammatical Error Correction

arXiv:2310.14947v121.2133 citationsh-index: 6Has Code

Originality Incremental advance

AI Analysis

This improves GEC accuracy for language learners and writers, but it is incremental as it builds on existing quality estimation methods.

The paper tackled the problem of combining multiple grammatical error correction (GEC) systems using quality estimation, and the result was a new model called GRECO that achieved state-of-the-art F0.5 scores on CoNLL-2014 and BEA-2019 test sets.

Quality estimation models have been developed to assess the corrections made by grammatical error correction (GEC) models when the reference or gold-standard corrections are not available. An ideal quality estimator can be utilized to combine the outputs of multiple GEC systems by choosing the best subset of edits from the union of all edits proposed by the GEC base systems. However, we found that existing GEC quality estimation models are not good enough in differentiating good corrections from bad ones, resulting in a low F0.5 score when used for system combination. In this paper, we propose GRECO, a new state-of-the-art quality estimation model that gives a better estimate of the quality of a corrected sentence, as indicated by having a higher correlation to the F0.5 score of a corrected sentence. It results in a combined GEC system with a higher F0.5 score. We also propose three methods for utilizing GEC quality estimation models for system combination with varying generality: model-agnostic, model-agnostic with voting bias, and model-dependent method. The combined GEC system outperforms the state of the art on the CoNLL-2014 test set and the BEA-2019 test set, achieving the highest F0.5 scores published to date.

View on arXiv PDF Code

Similar