CL, Jun 20, 2024

Dravidian language family through Universal Dependencies lens

arXiv:2406.14680v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the limited linguistic resources available for Dravidian languages, which are spoken by over 200 million people, by incrementally extending the UD framework to include more languages from this family.

The paper tackles the underrepresentation of Dravidian languages in the Universal Dependencies project by examining their morphological and syntactic features and exploring how they can be annotated within the UD framework. The result is a proposal for integrating these languages into a multilingual NLP resource that currently supports 114 languages.

The Universal Dependencies (UD) project aims to create cross-linguistically consistent dependency annotation for multiple languages, to facilitate multilingual NLP. It currently supports 114 languages. Dravidian languages are spoken by over 200 million people across the world, and yet there are only two languages from this family in UD. This paper examines some of the morphological and syntactic features of Dravidian languages and explores how they can be annotated in the UD framework.

