Comparison of Grammatical Error Correction Using Back-Translation Models
This work addresses the lack of parallel data in GEC by exploring how different back-translation models affect correction tendencies, offering incremental insights for improving pseudo data generation.
The study compared grammatical error correction (GEC) models trained on pseudo data generated by different back-translation models (Transformer, CNN, LSTM). It found that correction tendencies varied by error type, and that combining pseudo data from different back-translation models improved or interpolated the per-error-type F_0.5 scores relative to single back-translation models trained with different seeds.
Grammatical error correction (GEC) suffers from a lack of sufficient parallel data. GEC studies have therefore developed various methods to generate pseudo data, which consist of pairs of grammatical sentences and artificially produced ungrammatical sentences. A mainstream approach to generating pseudo data is currently back-translation (BT). Most previous GEC studies using BT have employed the same architecture for both the GEC and BT models. However, GEC models have different correction tendencies depending on their architectures. Thus, in this study, we compare the correction tendencies of GEC models trained on pseudo data generated by different BT models, namely, Transformer, CNN, and LSTM. The results confirm that the correction tendencies for each error type differ across BT models. Additionally, we examine the correction tendencies when using a combination of pseudo data generated by different BT models. We find that combining different BT models improves or interpolates the F_0.5 score of each error type compared with those of single BT models trained with different seeds.
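For readers unfamiliar with the metric, the per-error-type F_0.5 score mentioned above can be sketched as follows. F_0.5 is the standard F-beta measure with beta = 0.5, which weights precision more heavily than recall. This is a minimal illustration only: the function names, the error-type labels ("DET", "PREP"), and the counts are hypothetical and are not taken from the study, and the evaluation tool used in the study is not assumed here.

```python
from typing import Dict, Tuple


def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta from true-positive, false-positive, and false-negative edit counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def f_beta_by_error_type(
    counts: Dict[str, Tuple[int, int, int]], beta: float = 0.5
) -> Dict[str, float]:
    """Apply f_beta to each error type's (tp, fp, fn) counts separately."""
    return {
        etype: f_beta(tp, fp, fn, beta)
        for etype, (tp, fp, fn) in counts.items()
    }


# Hypothetical counts, for illustration only (not results from the study).
example_counts = {
    "DET": (8, 2, 8),   # precision 0.80, recall 0.50
    "PREP": (3, 1, 9),  # precision 0.75, recall 0.25
}
scores = f_beta_by_error_type(example_counts)
for etype, score in scores.items():
    print(f"{etype}: F_0.5 = {score:.4f}")
```

Because beta is below 1, a system with high precision and lower recall scores better under F_0.5 than under F_1, so proposing a wrong correction is penalized more heavily than missing an error.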