CL Jun 2, 2021

Minimax and Neyman-Pearson Meta-Learning for Outlier Languages

Edoardo Maria Ponti, Rahul Aralikatte, Disha Shrivastava, Siva Reddy, Anders Søgaard

arXiv:2106.01051v11.416 citations Has Code

Originality: Incremental advance

AI Analysis

This work aims to improve NLP performance on outlier, low-resource languages. It is an incremental advance in meta-learning for cross-lingual applications.

The authors address cross-lingual NLP for low-resource languages. They argue that the i.i.d. assumption in MAML makes it ill-suited for outlier languages, and they propose Minimax and Neyman-Pearson MAML variants to improve robustness. They report gains in average and minimum performance on part-of-speech tagging and question answering in zero- and few-shot settings.

Model-agnostic meta-learning (MAML) has been recently put forth as a strategy to learn resource-poor languages in a sample-efficient fashion. Nevertheless, the properties of these languages are often not well represented by those available during training. Hence, we argue that the i.i.d. assumption ingrained in MAML makes it ill-suited for cross-lingual NLP. In fact, under a decision-theoretic framework, MAML can be interpreted as minimising the expected risk across training languages (with a uniform prior), which is known as Bayes criterion. To increase its robustness to outlier languages, we create two variants of MAML based on alternative criteria: Minimax MAML reduces the maximum risk across languages, while Neyman-Pearson MAML constrains the risk in each language to a maximum threshold. Both criteria constitute fully differentiable two-player games. In light of this, we propose a new adaptive optimiser solving for a local approximation to their Nash equilibrium. We evaluate both model variants on two popular NLP tasks, part-of-speech tagging and question answering. We report gains for their average and minimum performance across low-resource languages in zero- and few-shot settings, compared to joint multi-source transfer and vanilla MAML.
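The three decision criteria named in the abstract differ only in how per-language risks are aggregated into one training objective. The sketch below illustrates that difference on a list of risks, one per training language. It is not the authors' implementation: the function names are hypothetical, and the Lagrangian form used for the Neyman-Pearson criterion (average risk plus multiplier-weighted violations of a threshold) is an assumption about how the per-language constraint could be made differentiable.

```python
from typing import Sequence


def bayes_objective(risks: Sequence[float]) -> float:
    """Expected risk under a uniform prior over training languages."""
    return sum(risks) / len(risks)


def minimax_objective(risks: Sequence[float]) -> float:
    """Worst-case (maximum) risk across training languages."""
    return max(risks)


def neyman_pearson_lagrangian(
    risks: Sequence[float],
    threshold: float,
    multipliers: Sequence[float],
) -> float:
    """Average risk plus a penalty for each language above the threshold.

    Assumed form: the constraint risk_i <= threshold is relaxed with one
    non-negative multiplier per language (hypothetical parameterisation).
    """
    penalty = sum(lam * (r - threshold) for lam, r in zip(multipliers, risks))
    return bayes_objective(risks) + penalty


# Illustrative risks for three training languages; the third is an outlier.
example_risks = [0.25, 0.5, 1.0]
bayes_value = bayes_objective(example_risks)
minimax_value = minimax_objective(example_risks)
np_value = neyman_pearson_lagrangian(
    example_risks, threshold=0.5, multipliers=[0.0, 0.0, 2.0]
)
```

In this toy setting the Bayes objective averages the outlier away, whereas the minimax objective is driven entirely by it. In the two-player view described in the abstract, one player would update the model parameters to lower these objectives while the other adjusts the language weights or multipliers to raise them.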

