Grammatical Error Correction as GAN-like Sequence Labeling
This work addresses a specific training challenge in GEC that is relevant to applications such as language learning tools. Its contribution is incremental, since it builds on existing sequence labeling methods.
The paper tackles the training-inference mismatch in sequence labeling models for Grammatical Error Correction (GEC). The mismatch arises because iterative correction exposes the model to sentences with progressively fewer errors, whereas training uses sentences with fixed error rates. The authors propose a GAN-like model that samples from real error distributions to improve training, and report improvements over the previous state-of-the-art baseline on several benchmarks.
In Grammatical Error Correction (GEC), sequence labeling models enjoy faster inference than sequence-to-sequence models. However, inference in sequence labeling GEC models is an iterative process: sentences are passed to the model for multiple rounds of correction, which exposes the model to sentences with progressively fewer errors at each round. Traditional GEC models, by contrast, learn from sentences with fixed error rates. Coupling this with the iterative correction process causes a mismatch between training and inference that affects final performance. To address this mismatch, we propose a GAN-like sequence labeling model, which consists of a grammatical error detector as a discriminator and a grammatical error labeler with Gumbel-Softmax sampling as a generator. Because we sample from real error distributions, our errors are more genuine than traditional synthesized GEC errors, which alleviates the aforementioned mismatch and allows for better training. Our results on several evaluation benchmarks demonstrate that our proposed approach is effective and improves on the previous state-of-the-art baseline.
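As a rough illustration of the Gumbel-Softmax sampling mentioned in the abstract, the sketch below draws one relaxed sample over a small set of edit labels. It shows only the generic sampling trick, not the paper's generator. The function name `gumbel_softmax`, the tag set in `LABELS`, the logits, and the temperature are all hypothetical values chosen for illustration.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return one relaxed (soft) one-hot sample over the given logits."""
    noisy = []
    for logit in logits:
        # Clamp u strictly inside (0, 1) so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise: g = -log(-log(u)).
        gumbel_noise = -math.log(-math.log(u))
        noisy.append((logit + gumbel_noise) / temperature)

    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical edit-label set and scores for a single token (assumptions).
LABELS = ["$KEEP", "$DELETE", "$REPLACE_the"]
logits = [2.0, 0.5, 1.0]

sample = gumbel_softmax(logits, temperature=0.5, rng=random.Random(0))
# Taking the argmax of the soft sample gives a discrete label choice.
hard_label = LABELS[max(range(len(sample)), key=sample.__getitem__)]
```

Lower temperatures push the sample closer to a one-hot vector, while higher temperatures make it closer to uniform. In a GAN-like setup, such a relaxation is typically used so that a sampled label remains differentiable with respect to the generator's logits; a real implementation would use an autodiff framework rather than plain Python lists.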