Neural Transductive Learning and Beyond: Morphological Generation in the Minimal-Resource Setting
This paper addresses paradigm completion for low-resource languages, offering incremental improvements in morphological generation.
The paper tackles the problem of morphological generation in minimal-resource settings, where neural seq2seq models struggle with small training sets, and proposes two new methods that outperform the previous state of the art by up to 9.71% absolute accuracy on a 52-language benchmark.
Neural state-of-the-art sequence-to-sequence (seq2seq) models often do not perform well for small training sets. We address paradigm completion, the morphological task of generating all missing forms of a given partial paradigm. We propose two new methods for the minimal-resource setting: (i) Paradigm transduction: Since we assume that only a few paradigms are available for training, neural seq2seq models are able to capture relationships between paradigm cells, but are tied to the idiosyncrasies of the training set. Paradigm transduction mitigates this problem by exploiting the input subset of inflected forms at test time. (ii) Source selection with high precision (SHIP): Multi-source models that learn to automatically select one or multiple sources to predict a target inflection do not perform well in the minimal-resource setting. SHIP is an alternative for identifying a reliable source when training data is limited. On a 52-language benchmark dataset, we outperform the previous state of the art by up to 9.71% absolute accuracy.
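To make the task setup concrete, the following is a minimal Python sketch of how a partial paradigm could be represented and how the forms given at test time could be turned into extra source-to-target examples, which is one plausible reading of "exploiting the input subset of inflected forms at test time". The function names, the dictionary layout, and the cell tags (INF, 3SG, PST, PTCP) are assumptions made for illustration; the abstract does not specify the data format or the training procedure.

```python
from itertools import permutations


def missing_cells(partial_paradigm):
    """Return the tags of paradigm cells whose form is not given (None)."""
    return [tag for tag, form in partial_paradigm.items() if form is None]


def transduction_pairs(partial_paradigm):
    """Build (source_tag, source_form, target_tag, target_form) examples
    from every ordered pair of cells that ARE given at test time.

    Assumption: such pairs could be used to adapt a seq2seq model to the
    test paradigm before it predicts the missing cells.
    """
    known = {tag: form for tag, form in partial_paradigm.items()
             if form is not None}
    return [
        (src_tag, known[src_tag], tgt_tag, known[tgt_tag])
        for src_tag, tgt_tag in permutations(known, 2)
    ]


# Hypothetical partial paradigm: three forms given, one cell to generate.
paradigm = {"INF": "walk", "3SG": "walks", "PST": "walked", "PTCP": None}

pairs = transduction_pairs(paradigm)
targets = missing_cells(paradigm)
```

With three given forms, the sketch yields six ordered source-target pairs and one cell left to generate. How a reliable source is chosen for each missing cell (the role of SHIP) is not described in the abstract, so it is left out here.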