A CRF Based POS Tagger for Code-mixed Indian Social Media Text
This work addresses POS tagging for code-mixed Indian social media text, an incremental improvement within a specific domain.
The authors tackled POS tagging for code-mixed Indian social media text using a CRF-based system, achieving the highest overall average F1 score (79.99) among the 16 systems in the constrained mode of the contest.
In this work, we describe a conditional random fields (CRF) based system for part-of-speech (POS) tagging of code-mixed Indian social media text, developed for our participation in the tool contest on POS tagging for code-mixed Indian social media text held in conjunction with the 2016 International Conference on Natural Language Processing, IIT (BHU), India. We participated only in the constrained mode of the contest, for all three language pairs: Bengali-English, Hindi-English and Telugu-English. Our system achieves an overall average F1 score of 79.99, the highest among all 16 systems that participated in the constrained mode.
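
To make the CRF setting concrete, the following is a minimal sketch of how per-token features might be prepared for a CRF tagger on code-mixed social media text. The feature set (affixes, casing, hashtag, mention and URL flags, neighbouring words), the function names `token_features` and `sentence_features`, and the example sentence are all illustrative assumptions; they are not the features or code of the system described above.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build an illustrative feature dict for the token at position i.

    The features here are assumptions chosen for the example, not the
    feature set of the system described in the abstract.
    """
    tok = tokens[i]
    feats: Dict[str, object] = {
        "lower": tok.lower(),
        "prefix3": tok[:3].lower(),
        "suffix3": tok[-3:].lower(),
        "is_upper": tok.isupper(),
        "is_title": tok.istitle(),
        "is_digit": tok.isdigit(),
        # Social-media-specific cues
        "is_hashtag": tok.startswith("#"),
        "is_mention": tok.startswith("@"),
        "is_url": tok.lower().startswith(("http://", "https://", "www.")),
        # Rough cue for native-script (non-romanized) tokens
        "has_non_ascii": any(ord(c) > 127 for c in tok),
    }
    # Context features: neighbouring words, or sentence-boundary markers
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True
    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Return one feature dict per token in the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Hypothetical romanized Hindi-English sentence, for illustration only
example = ["@friend", "yeh", "movie", "bahut", "awesome", "thi", "#weekend"]
features = sentence_features(example)
```

A sequence of feature dicts like `features`, paired with a sequence of gold POS tags of the same length, is the general shape of training input that common CRF toolkits accept; the choice of toolkit and of features is left open here.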