String Transduction with Target Language Models and Insertion Handling
The work targets accuracy on character-level NLP tasks such as inflection generation, and is relevant to both researchers and practitioners. The contribution appears incremental, since it enhances existing methods rather than introducing a new one.
The paper tackles character-level sequence-to-sequence transduction by combining target language models derived from unannotated corpora with a precise alignment of the training data, and reports state-of-the-art results on cognate projection, inflection generation, and phoneme-to-grapheme conversion.
Many character-level tasks can be framed as sequence-to-sequence transduction, where the target is a word from a natural language. We show that leveraging target language models derived from unannotated target corpora, combined with a precise alignment of the training data, yields state-of-the-art results on cognate projection, inflection generation, and phoneme-to-grapheme conversion.
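The abstract does not say how the target language model is integrated, so the following is only an illustrative sketch of one common way to use one: reranking a transducer's candidate outputs with a character bigram model estimated from a list of unannotated target words. The function names, the add-one smoothing, the toy corpus, and the candidate scores are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter

BOS, EOS = "<", ">"  # hypothetical word-boundary symbols


def train_char_bigram_lm(words):
    """Count character bigrams over unannotated target-language words."""
    bigrams, contexts, vocab = Counter(), Counter(), {EOS}
    for word in words:
        chars = [BOS] + list(word) + [EOS]
        vocab.update(chars[1:])
        for prev, cur in zip(chars, chars[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab


def lm_logprob(word, lm):
    """Add-one smoothed log-probability of a word under the bigram model."""
    bigrams, contexts, vocab = lm
    v = len(vocab) + 1  # one extra slot reserves mass for unseen characters
    chars = [BOS] + list(word) + [EOS]
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + v))
        for prev, cur in zip(chars, chars[1:])
    )


def rerank(candidates, lm, lm_weight=1.0):
    """Sort (string, transducer_log_score) pairs by the combined score."""
    return sorted(
        candidates,
        key=lambda c: c[1] + lm_weight * lm_logprob(c[0], lm),
        reverse=True,
    )


# Toy data: a tiny "unannotated corpus" and two made-up candidates for the
# past tense of "walk", with made-up transducer log-scores.
lm = train_char_bigram_lm(["walked", "talked", "jumped", "played"])
candidates = [("walkad", -1.0), ("walked", -1.5)]
ranked = rerank(candidates, lm)
print(ranked[0][0])
```

In this toy setting the transducer alone prefers the ill-formed candidate, while the language model favors the string whose character sequences are attested in the corpus. The two candidates have equal length on purpose: an unnormalized language model score penalizes longer strings, so a practical system would need some form of length handling or weight tuning.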