A Light Sliding-Window Part-of-Speech Tagger for the Apertium Free/Open-Source Machine Translation Platform
This work targets part-of-speech tagging accuracy for users of the Apertium free/open-source machine translation platform. The contribution appears incremental: it builds on an existing tagging method and adds specific modifications for incorporating linguistic rules.
The paper tackles part-of-speech tagging for the Apertium machine translation platform by implementing a light sliding-window tagger and proposing a new method for incorporating linguistic rules. It compares the tagger's performance under various settings and against a traditional HMM tagger.
This paper describes a free/open-source implementation of the light sliding-window (LSW) part-of-speech tagger for the Apertium free/open-source machine translation platform. First, the mechanism and training process of the tagger are reviewed, and a new method for incorporating linguistic rules is proposed. Second, experiments compare the performance of the tagger under different window settings, with and without Apertium-style "forbid" rules, with and without Constraint Grammar, and against the traditional HMM tagger in Apertium.
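
As a rough illustration only, and not the implementation described in this paper, the following Python sketch shows the general idea of window-based disambiguation: a word's candidate tags are narrowed using the ambiguity classes of its immediate neighbours, and "forbid"-style rules prune further. Everything here is an assumption made for the example: the function name `disambiguate`, the `allowed` context table, the one-word window on each side, the `<s>` boundary class, and the modelling of "forbid" rules as forbidden pairs of adjacent tags.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

AmbClass = FrozenSet[str]             # set of tags a word form may take
Context = Tuple[AmbClass, AmbClass]   # (left neighbour class, right neighbour class)

BOUNDARY: AmbClass = frozenset({"<s>"})  # hypothetical sentence-boundary class


def disambiguate(
    sentence: List[AmbClass],
    allowed: Dict[Context, Set[str]],
    forbidden: Set[Tuple[str, str]],
) -> List[Set[str]]:
    """Return the surviving candidate tags for each word (window of +/-1)."""
    padded = [BOUNDARY] + sentence + [BOUNDARY]
    result: List[Set[str]] = []
    for i in range(1, len(padded) - 1):
        left, here, right = padded[i - 1], padded[i], padded[i + 1]
        candidates = set(here)
        # Keep only tags allowed in this context, if the context is known
        # and the restriction would not remove every candidate.
        ctx_tags = allowed.get((left, right))
        if ctx_tags is not None and candidates & ctx_tags:
            candidates &= ctx_tags
        # "forbid"-style pruning (assumed form): drop a tag only if every
        # possible tag of the left neighbour forbids it.
        pruned = {t for t in candidates
                  if not all((l, t) in forbidden for l in left)}
        result.append(pruned or candidates)  # never leave a word without tags
    return result
```

For instance, with the toy tags `det`, `noun`, `verb`, and `adj`, a word that is ambiguous between `noun` and `verb` and follows an unambiguous `det` keeps only `noun` when the pair `("det", "verb")` is forbidden. A wider window would use more neighbours as context at the cost of a larger table.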