CL AI Sep 16, 2021

Language Models are Few-shot Multilingual Learners

Genta Indra Winata, Andrea Madotto, Zhaojiang Lin, Rosanne Liu, Jason Yosinski, Pascale Fung

arXiv:2109.07684v1 31.6681 citations Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the problem of enabling language models to handle multilingual NLP tasks efficiently. The advance is incremental because it builds on existing few-shot learning capabilities.

The study evaluated GPT and T5 models on multilingual multi-class classification without any parameter updates, showing that they can predict non-English samples from a few English examples. The results are significantly better than random prediction and competitive with state-of-the-art cross-lingual models.

General-purpose language models have demonstrated impressive capabilities, performing on par with state-of-the-art approaches on a range of downstream natural language processing (NLP) tasks and benchmarks when inferring instructions from very few examples. Here, we evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages without any parameter updates. We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones. Finally, we find the in-context few-shot cross-lingual prediction results of language models are significantly better than random prediction, and they are competitive compared to the existing state-of-the-art cross-lingual models.
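The in-context cross-lingual setup described in the abstract can be sketched roughly as follows: a few labeled English examples are concatenated into a prompt, a non-English query is appended, and the label the model scores highest is taken as the prediction, with no parameter updates. This is a minimal sketch under stated assumptions, not the authors' implementation. The `=>` template, the function names, and the example sentences are hypothetical, and `toy_score` is only a stand-in for a real language model's likelihood of a label given the prompt.

```python
from typing import Callable, Sequence, Tuple


def build_prompt(examples: Sequence[Tuple[str, str]], query: str) -> str:
    """Concatenate labeled examples, then the unlabeled query (assumed template)."""
    shots = [f"{text} => {label}" for text, label in examples]
    return "\n".join(shots + [f"{query} =>"])


def predict(prompt: str, labels: Sequence[str],
            score: Callable[[str, str], float]) -> str:
    """Return the candidate label that the scorer rates highest for this prompt."""
    return max(labels, key=lambda label: score(prompt, label))


def toy_score(prompt: str, label: str) -> float:
    # Stand-in scorer (assumption): counts how often the label appears in the
    # context. A real setup would use the language model's log-probability
    # of `label` as the continuation of `prompt`.
    return float(prompt.count(f"=> {label}"))


# English examples as context, Indonesian sentence as the test sample.
examples = [
    ("I loved this movie", "positive"),
    ("The plot was dull", "negative"),
    ("A wonderful cast", "positive"),
]
prompt = build_prompt(examples, "Film ini sangat bagus")
prediction = predict(prompt, ["positive", "negative"], toy_score)
```

Swapping `toy_score` for a function that queries a pre-trained model keeps the rest of the sketch unchanged, since the model is only used to score candidate labels.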
