CL · AI · Nov 26, 2019

Feature-Rich Part-of-speech Tagging for Morphologically Complex Languages: Application to Bulgarian

arXiv:1911.11503v1 · 1086 citations
Originality: Incremental advance
AI Analysis

This improves NLP tools for Bulgarian speakers and researchers by significantly advancing state-of-the-art tagging accuracy.

The paper tackled part-of-speech tagging for Bulgarian, a morphologically complex language, by using 680 morpho-syntactic tags and combining a large lexicon with linguistic knowledge and guided learning, achieving an accuracy of 97.98%.

We present experiments with part-of-speech tagging for Bulgarian, a Slavic language with rich inflectional and derivational morphology. Unlike most previous work, which has used a small number of grammatical categories, we work with 680 morpho-syntactic tags. We combine a large morphological lexicon with prior linguistic knowledge and guided learning from a POS-annotated corpus, achieving accuracy of 97.98%, which is a significant improvement over the state-of-the-art for Bulgarian.

