CL, Jul 23, 2019

CMU-01 at the SIGMORPHON 2019 Shared Task on Crosslinguality and Context in Morphology

Aditi Chaudhary, Elizabeth Salesky, Gayatri Bhat, David R. Mortensen, Jaime G. Carbonell, Yulia Tsvetkov

arXiv:1907.10129

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of training deep neural models for morphological tasks in under-resourced languages. The contribution is incremental: it applies existing methods to new data through transfer learning.

The paper tackles morphological analysis and lemmatization in context for 107 treebanks, most of them under-resourced. It uses a hierarchical neural CRF model that predicts each coarse-grained feature independently, and it proposes a multilingual transfer training regime that transfers from related languages.

This paper presents the CMU-01 team's submission to Task 2 of the SIGMORPHON 2019 shared task, Morphological Analysis and Lemmatization in Context. The task requires producing the lemma and morpho-syntactic description of each token in a sequence, for 107 treebanks. We approach it with a hierarchical neural conditional random field (CRF) model that predicts each coarse-grained feature (e.g., POS, Case) independently. However, most treebanks are under-resourced, which makes it challenging to train deep neural models for them. Hence, we propose a multilingual transfer training regime in which we transfer from multiple related languages that share similar typology.
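To illustrate what predicting each coarse-grained feature independently can look like at the data level, here is a minimal Python sketch. It is not the authors' implementation and contains no CRF or neural component. It only shows how a composite morpho-syntactic tag could be split into one label sequence per feature and merged back after prediction. The semicolon-separated tag format, the `VALUE_TO_DIMENSION` table, the `_` placeholder, and all function names are assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical, partial mapping from feature values to coarse-grained
# dimensions. A real system would need the full tag schema.
VALUE_TO_DIMENSION = {
    "N": "POS", "V": "POS", "ADJ": "POS",
    "NOM": "Case", "ACC": "Case",
    "SG": "Number", "PL": "Number",
    "PST": "Tense", "PRS": "Tense",
}

# Placeholder label for "this token has no value for this dimension"
# (an assumption of this sketch).
NONE_LABEL = "_"


def split_tag(tag: str) -> Dict[str, str]:
    """Split a composite tag such as 'N;NOM;SG' into {dimension: value}."""
    features = {}
    for value in tag.split(";"):
        features[VALUE_TO_DIMENSION[value]] = value
    return features


def per_feature_sequences(tags: List[str], dimensions: List[str]) -> Dict[str, List[str]]:
    """Build one label sequence per dimension; each could be tagged separately."""
    split = [split_tag(t) for t in tags]
    return {d: [s.get(d, NONE_LABEL) for s in split] for d in dimensions}


def merge_predictions(predictions: Dict[str, List[str]]) -> List[str]:
    """Recombine per-dimension label sequences into composite tags."""
    length = len(next(iter(predictions.values())))
    merged = []
    for i in range(length):
        values = [seq[i] for seq in predictions.values() if seq[i] != NONE_LABEL]
        merged.append(";".join(values))
    return merged


tags = ["N;NOM;SG", "V;PST"]
dimensions = ["POS", "Case", "Number", "Tense"]
sequences = per_feature_sequences(tags, dimensions)
roundtrip = merge_predictions(sequences)
```

In this sketch each entry of `sequences` is an ordinary sequence-labeling target, so a separate tagger per dimension could be trained on it. How the paper's hierarchical CRF ties these predictions together is not described in the abstract and is not modeled here.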

