CL, Dec 23, 2016

A CRF Based POS Tagger for Code-mixed Indian Social Media Text

arXiv:1612.07956v1, 6 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses POS tagging for code-mixed text in Indian languages; it is an incremental improvement within a specific domain.

The authors tackled POS tagging for code-mixed Indian social media text using a CRF-based system, achieving an overall average F1 score of 79.99, the highest among the 16 systems in the constrained mode of the contest.

In this work, we describe a conditional random fields (CRF) based system for Part-of-Speech (POS) tagging of code-mixed Indian social media text, developed as part of our participation in the tool contest on POS tagging for code-mixed Indian social media text, held in conjunction with the 2016 International Conference on Natural Language Processing, IIT(BHU), India. We participated only in the constrained mode of the contest, for all three language pairs: Bengali-English, Hindi-English and Telugu-English. Our system achieves an overall average F1 score of 79.99, the highest among all 16 systems that participated in the constrained mode.
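To make the idea of a CRF-based tagger more concrete, here is a minimal sketch of the kind of per-token feature extraction such a system typically relies on. The abstract does not list the features the authors used, so everything below is an assumption made for illustration: the function names `token_features` and `sentence_features`, the specific features (affixes, casing, hashtag and mention flags, neighbouring tokens), and the example sentence.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a feature dict for tokens[i].

    Hypothetical feature set, not the one used in the paper.
    """
    tok = tokens[i]
    feats: Dict[str, object] = {
        "lower": tok.lower(),
        "prefix3": tok[:3].lower(),          # first three characters
        "suffix3": tok[-3:].lower(),         # last three characters
        "is_upper": tok.isupper(),
        "is_digit": tok.isdigit(),
        "is_hashtag": tok.startswith("#"),   # social media marker
        "is_mention": tok.startswith("@"),   # social media marker
        "is_ascii": tok.isascii(),           # rough cue for romanized vs. native script
    }
    # Context features: neighbouring tokens, with boundary placeholders.
    feats["prev"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Return one feature dict per token in the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Made-up Hindi-English code-mixed example, for illustration only.
example = ["@ravi", "yeh", "movie", "bahut", "awesome", "thi", "#weekend"]
features = sentence_features(example)
```

A list of such dicts per sentence, paired with the gold POS tags, is the input format that common CRF toolkits expect; which toolkit and which features the authors actually used is not stated in the abstract.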
