Building Odia Shallow Parser
This work addresses a resource gap for NLP applications in Odia, but it is incremental, as it applies existing methods to a new dataset.
The paper tackles the lack of annotated corpora for shallow parsing in Odia, a resource-poor Indian language, by creating a quality POS- and chunk-annotated corpus and developing baseline systems for POS tagging and chunking.
Shallow parsing is an essential task for many NLP applications such as machine translation, summarization, sentiment analysis, and aspect identification. Quality annotated corpora are critical for building accurate shallow parsers. Many Indian languages are resource-poor with respect to the availability of corpora in general. This paper is therefore an attempt at creating quality corpora for shallow parsers. The contribution of this paper is twofold: the creation of POS- and chunk-annotated corpora for Odia, and the development of baseline systems for POS tagging and chunking in Odia.
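To make the two annotation layers concrete, the following is a minimal sketch of what a POS- and chunk-annotated sentence might look like and how chunks can be read off it. It assumes a BIO-style chunk encoding (`B-` begins a chunk, `I-` continues it, `O` is outside any chunk), which is a common convention but is not confirmed here as the paper's scheme. The English sentence, the tag names, and the `extract_chunks` helper are all hypothetical illustrations, not material from the Odia corpus.

```python
from typing import List, Tuple

# Each token carries a word, a POS tag, and a chunk tag in BIO form.
# The tag inventory below is an assumption made for illustration only.
Token = Tuple[str, str, str]


def extract_chunks(tokens: List[Token]) -> List[Tuple[str, List[str]]]:
    """Group BIO-tagged tokens into (chunk_label, words) pairs."""
    chunks: List[Tuple[str, List[str]]] = []
    current_label = None
    current_words: List[str] = []

    for word, _pos, chunk_tag in tokens:
        if chunk_tag.startswith("B-"):
            # A B- tag closes any open chunk and starts a new one.
            if current_label is not None:
                chunks.append((current_label, current_words))
            current_label = chunk_tag[2:]
            current_words = [word]
        elif chunk_tag.startswith("I-") and current_label == chunk_tag[2:]:
            # An I- tag with a matching label extends the open chunk.
            current_words.append(word)
        else:
            # "O" (or an I- tag that does not match the open chunk)
            # closes the open chunk; the token itself is not chunked.
            if current_label is not None:
                chunks.append((current_label, current_words))
            current_label = None
            current_words = []

    # Flush a chunk that runs to the end of the sentence.
    if current_label is not None:
        chunks.append((current_label, current_words))
    return chunks


# Hypothetical annotated sentence (English stand-in, not corpus data).
sentence: List[Token] = [
    ("The", "DET", "B-NP"),
    ("small", "ADJ", "I-NP"),
    ("boy", "NOUN", "I-NP"),
    ("reads", "VERB", "B-VP"),
    ("a", "DET", "B-NP"),
    ("book", "NOUN", "I-NP"),
    (".", "PUNCT", "O"),
]

if __name__ == "__main__":
    for label, words in extract_chunks(sentence):
        print(label, " ".join(words))
```

In this framing, POS tagging predicts the second column from the words, and chunking predicts the third column, typically using the words and POS tags as features.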