CL LG

Oct 28, 2019

Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling

arXiv:1910.12702v131.11096 citations

h-index: 55

Originality: Incremental advance

AI Analysis

This paper addresses the limited resources and noisy data available for dialectal variants of morphologically rich languages, offering an incremental improvement in cross-dialectal modeling.

The paper tackles morphological tagging for morphologically rich languages with dialectal variation by combining multitask learning and adversarial training. It achieves state-of-the-art results for both Modern Standard Arabic and Egyptian Arabic, and adversarial training gives larger improvements when the training dataset is smaller.

Morphological tagging is challenging for morphologically rich languages due to the large target space and the need for more training data to minimize model sparsity. Dialectal variants of morphologically rich languages suffer more, as they tend to be noisier and have fewer resources. In this paper, we explore the use of multitask learning and adversarial training to address morphological richness and dialectal variation in the context of full morphological tagging. We use multitask learning for joint morphological modeling of the features within two dialects, and as a knowledge-transfer scheme for cross-dialectal modeling. We use adversarial training to learn dialect-invariant features that can help the knowledge-transfer scheme from the high-resource to the low-resource variant. We work with two dialectal variants as a case study: Modern Standard Arabic (high-resource "dialect") and Egyptian Arabic (low-resource dialect). Our models achieve state-of-the-art results for both. Furthermore, adversarial training provides larger improvements when the training datasets are smaller.
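As a rough illustration of how the two ideas in the abstract can combine into one training objective, the sketch below sums per-feature tagging losses (the multitask part) and subtracts a weighted dialect-classification loss (the adversarial part), so that a shared encoder would be rewarded when the dialect cannot be predicted from its features. This is a minimal sketch under assumptions, not the paper's implementation: the feature names (`pos`, `gender`), the `adv_weight` parameter, and the exact form of the objective are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, gold_index):
    """Negative log-probability assigned to the gold class."""
    return -math.log(softmax(logits)[gold_index])


def multitask_loss(feature_logits, gold_tags):
    """Sum the tagging losses over all morphological features of one token.

    feature_logits: dict mapping a feature name to its list of class scores.
    gold_tags: dict mapping the same feature names to gold class indices.
    """
    return sum(
        cross_entropy(feature_logits[name], gold_tags[name]) for name in gold_tags
    )


def encoder_objective(feature_logits, gold_tags, dialect_logits, gold_dialect,
                      adv_weight=0.1):
    """Hypothetical objective for the shared encoder.

    The encoder minimizes the tagging loss while maximizing the dialect
    discriminator's loss, which pushes it toward dialect-invariant features.
    adv_weight is an assumed hyperparameter, not a value from the paper.
    """
    tagging = multitask_loss(feature_logits, gold_tags)
    dialect = cross_entropy(dialect_logits, gold_dialect)
    return tagging - adv_weight * dialect


# Toy token with two illustrative features and a two-way dialect label.
example_logits = {"pos": [2.0, 0.5, -1.0], "gender": [0.1, 1.2]}
example_gold = {"pos": 0, "gender": 1}
objective = encoder_objective(example_logits, example_gold, [0.3, -0.3], 0)
```

In a neural implementation the dialect discriminator would be trained separately to minimize its own loss, and the sign flip for the encoder is often realized with a gradient reversal layer; whether the paper uses that exact mechanism is not stated in the abstract.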
