The NYU-CUBoulder Systems for SIGMORPHON 2020 Task 0 and Task 2
This work addresses morphological generation, a problem of interest to linguists and NLP researchers. Its contribution is incremental: it builds on existing transformer models with minor modifications.
The paper tackles morphological inflection (Task 0) and unsupervised morphological paradigm completion (Task 2) in the SIGMORPHON 2020 shared task, placing 6th out of 23 systems for inflection and 2nd out of 7 submissions for paradigm completion.
We describe the NYU-CUBoulder systems for the SIGMORPHON 2020 Task 0 on typologically diverse morphological inflection and Task 2 on unsupervised morphological paradigm completion. The former consists of generating morphological inflections from a lemma and a set of morphosyntactic features describing the target form. The latter requires generating entire paradigms for a set of given lemmas from raw text alone. We model morphological inflection as a sequence-to-sequence problem, where the input is the sequence of the lemma's characters with morphological tags, and the output is the sequence of the inflected form's characters. First, we apply a transformer model to the task. Second, as inflected forms share most characters with the lemma, we further propose a pointer-generator transformer model to allow easy copying of input characters. Our best performing system for Task 0 is placed 6th out of 23 systems. We further use our inflection systems as subcomponents of approaches for Task 2. Our best performing system for Task 2 is the 2nd best out of 7 submissions.
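To make the sequence-to-sequence framing and the copying idea concrete, the following is a minimal sketch, not the submitted system. It assumes that tags are placed before the lemma characters and written as bracketed symbols such as `<V>`; the abstract does not specify the ordering or the tag format. The helper names `encode_source` and `pointer_generator_mix` are hypothetical. The mixing step follows the general pointer-generator formulation, in which a gate `p_gen` interpolates between a distribution over the output vocabulary and a copy distribution given by attention over the input positions; the exact computation used in the paper's model may differ.

```python
from collections import defaultdict


def encode_source(lemma, tags):
    """Build the source sequence: one symbol per tag, then one per lemma character.

    Assumption: tags come first and are wrapped in angle brackets.
    """
    return [f"<{tag}>" for tag in tags] + list(lemma)


def pointer_generator_mix(p_gen, vocab_dist, attention, source):
    """Combine generating and copying into one output distribution.

    p_gen      -- probability of generating from the vocabulary (0..1)
    vocab_dist -- dict mapping output symbol -> generation probability
    attention  -- attention weights over source positions (sums to 1)
    source     -- source symbols, aligned with `attention`
    """
    final = defaultdict(float)
    # Generation path: scale the vocabulary distribution by p_gen.
    for symbol, prob in vocab_dist.items():
        final[symbol] += p_gen * prob
    # Copy path: each source position contributes its attention mass
    # to the symbol found at that position.
    for symbol, weight in zip(source, attention):
        final[symbol] += (1.0 - p_gen) * weight
    return dict(final)


# Toy example with made-up numbers (illustrative only, not model outputs).
source = encode_source("walk", ["V", "PST"])
vocab_dist = {"w": 0.1, "e": 0.6, "d": 0.3}
attention = [0.0, 0.0, 0.7, 0.1, 0.1, 0.1]
final = pointer_generator_mix(0.5, vocab_dist, attention, source)
```

In this toy step, the character "w" receives probability mass from both paths, so copying from the lemma raises its score above what generation alone would give. This reflects the motivation stated above: inflected forms share most characters with the lemma.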