CL · Nov 30, 2020

Modelling Verbal Morphology in Nen

arXiv:2011.14489v2839 citations
AI Analysis

This work addresses the need for NLP tools for the low-resource Nen language, which has a highly complex verbal morphology, by exploring the effectiveness and error patterns of current machine learning models.

This paper models the complex verbal morphology of Nen, where a transitive verb can have up to 1,740 unique forms, using state-of-the-art machine learning systems for morphological reinflection. The study finds that accuracy is sensitive to the composition of the training data: different distributions of verb types yield different accuracies, and these differences pattern with E-complexity.

Nen verbal morphology is remarkably complex; a transitive verb can take up to 1,740 unique forms. The combined effect of having a large combinatoric space and a low-resource setting amplifies the need for NLP tools. Nen morphology utilises distributed exponence, a non-trivial means of mapping form to meaning. In this paper, we attempt to model Nen verbal morphology using state-of-the-art machine learning models for morphological reinflection. We explore and categorise the types of errors these systems generate. Our results show sensitivity to training data composition; different distributions of verb type yield different accuracies (patterning with E-complexity). We also demonstrate the types of patterns that can be inferred from the training data through the case study of syncretism.
