CL · Aug 9, 2018

Building a Kannada POS Tagger Using Machine Learning and Neural Network Models

Ketan Kumar Todi, Pruthwik Mishra, Dipti Misra Sharma

arXiv:1808.03175v10.223 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the lack of quality NLP tools for Kannada, a gap that matters for downstream tasks such as parsing and sentiment analysis. It is incremental, however, in that it builds on existing methods.

The authors tackled the problem of part-of-speech tagging for Kannada, a low-resource language, by developing a statistical tagger using machine learning and neural network models, achieving a 6% improvement over the state-of-the-art.

POS tagging serves as a preliminary task for many NLP applications. Kannada is a relatively resource-poor Indian language, with a very limited number of quality NLP tools available for use. An accurate and reliable POS tagger is essential for many NLP tasks such as shallow parsing, dependency parsing, sentiment analysis, and named entity recognition. We present a statistical POS tagger for Kannada using different machine learning and neural network models. Our Kannada POS tagger outperforms the state-of-the-art Kannada POS tagger by 6%. Our contribution in this paper is threefold: building a generic POS tagger, comparing the performance of different modeling techniques, and exploring the use of character and word embeddings together for Kannada POS tagging.
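The idea of using character and word embeddings together can be illustrated with a minimal sketch. This is not the authors' architecture. It only shows one common way to combine the two signals: concatenating a word-level vector with a vector pooled from the word's characters, so that out-of-vocabulary words still get a non-trivial representation. The dimensions, the random stand-in embeddings, the averaging step, the toy tokens, and all function names here are assumptions made for illustration.

```python
import random

WORD_DIM = 4  # assumed toy dimension, not taken from the paper
CHAR_DIM = 3  # assumed toy dimension, not taken from the paper


def make_table(symbols, dim, seed):
    """Give each symbol a fixed pseudo-random vector (stand-in for trained embeddings)."""
    rng = random.Random(seed)
    return {s: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for s in symbols}


def char_vector(word, char_table):
    """Average the vectors of the word's known characters (a simple pooling assumption)."""
    vecs = [char_table[c] for c in word if c in char_table]
    if not vecs:
        return [0.0] * CHAR_DIM
    return [sum(column) / len(vecs) for column in zip(*vecs)]


def token_features(word, word_table, char_table):
    """Concatenate the word vector (zeros if out of vocabulary) with the character-level vector."""
    word_vec = word_table.get(word, [0.0] * WORD_DIM)
    return word_vec + char_vector(word, char_table)


# Toy tokens chosen for illustration only.
sentence = ["ನಾನು", "ಶಾಲೆಗೆ", "ಹೋಗುತ್ತೇನೆ"]

# The word vocabulary deliberately omits the last token to simulate an unseen word.
word_table = make_table(sentence[:2], WORD_DIM, seed=0)
char_table = make_table(sorted({c for w in sentence for c in w}), CHAR_DIM, seed=1)

features = [token_features(w, word_table, char_table) for w in sentence]
```

Each entry of `features` has `WORD_DIM + CHAR_DIM` values. For the unseen third token the word part is all zeros, while the character part still carries information. In a real tagger these vectors would be learned and passed to a sequence model or classifier rather than generated randomly.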

