CL · Oct 21, 2020

NeuSpell: A Neural Spelling Correction Toolkit

arXiv:2010.11085v1 · 998 citations · Has Code
Originality: Incremental advance
AI Analysis

NeuSpell offers an incremental improvement for practitioners who need better spelling correction tools, with applications such as combating adversarial misspellings.

The authors address English spelling correction by introducing NeuSpell, a toolkit of ten models. Training on synthetic misspellings placed in context improves correction rates by 9% (absolute) over training on randomly sampled character perturbations, and richer contextual representations add another 3%.

We introduce NeuSpell, an open-source toolkit for spelling correction in English. Our toolkit comprises ten different models, and benchmarks them on naturally occurring misspellings from multiple sources. We find that many systems do not adequately leverage the context around the misspelt token. To remedy this, (i) we train neural models using spelling errors in context, synthetically constructed by reverse engineering isolated misspellings; and (ii) use contextual representations. By training on our synthetic examples, correction rates improve by 9% (absolute) compared to the case when models are trained on randomly sampled character perturbations. Using richer contextual representations boosts the correction rate by another 3%. Our toolkit enables practitioners to use our proposed and existing spelling correction systems, both via a unified command line, as well as a web interface. Among many potential applications, we demonstrate the utility of our spell-checkers in combating adversarial misspellings. The toolkit can be accessed at neuspell.github.io. Code and pretrained models are available at http://github.com/neuspell/neuspell.
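The contrast the abstract draws between the two kinds of training noise can be sketched as follows. This is a minimal illustration, not the NeuSpell implementation: the `MISSPELLINGS` table, the function names, and the `rate` parameter are all hypothetical, and the toolkit's actual noising procedure may differ in its details.

```python
import random

# Hypothetical lookup of isolated misspellings (correct word -> observed typos).
# Entries are illustrative only and are not taken from the NeuSpell data.
MISSPELLINGS = {
    "receive": ["recieve", "receve"],
    "definitely": ["definately", "definitly"],
    "separate": ["seperate"],
}


def random_perturb(word, rng):
    """Apply one random edit to an internal character (swap, drop, add, substitute)."""
    if len(word) < 4:
        return word  # short words are left untouched
    i = rng.randrange(1, len(word) - 2)  # i and i + 1 are both internal positions
    op = rng.choice(["swap", "drop", "add", "sub"])
    letter = rng.choice("abcdefghijklmnopqrstuvwxyz")
    if op == "swap":
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "drop":
        return word[:i] + word[i + 1:]
    if op == "add":
        return word[:i] + letter + word[i:]
    return word[:i] + letter + word[i + 1:]


def noise_sentence(tokens, rng, rate, use_lookup):
    """Corrupt each token with probability `rate`, keeping the sentence context intact.

    use_lookup=True  -> replace a word with a known misspelling of it, if any
    use_lookup=False -> apply a random character perturbation
    """
    noisy = []
    for tok in tokens:
        if rng.random() >= rate:
            noisy.append(tok)
        elif use_lookup:
            noisy.append(rng.choice(MISSPELLINGS.get(tok, [tok])))
        else:
            noisy.append(random_perturb(tok, rng))
    return noisy


rng = random.Random(0)
clean = "please keep them separate".split()
lookup_noisy = noise_sentence(clean, rng, rate=1.0, use_lookup=True)
random_noisy = noise_sentence(clean, rng, rate=1.0, use_lookup=False)
```

Pairing each noisy sentence with its clean original yields (input, target) training examples. The lookup variant only produces errors that resemble ones people actually make, which is the intuition behind the reported gain over random perturbations.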

Code Implementations: 1 repo