CL AI LG

Sep 19, 2023

Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer

Fei Wang, Kuan-Hao Huang, Kai-Wei Chang, Muhao Chen

arXiv:2309.10891v1

20.8127 citations

h-index: 64

Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the high cost of obtaining alignment data for low-resource languages, offering a more accessible route to multilingual NLP applications.

The paper tackles zero-shot cross-lingual transfer in multilingual NLP by proposing SALT, a method that improves transferability without external data and reports gains on the XNLI and PAWS-X benchmarks.

Zero-shot cross-lingual transfer is a central task in multilingual NLP, allowing models trained in languages with more sufficient training resources to generalize to other low-resource languages. Earlier efforts on this task use parallel corpora, bilingual dictionaries, or other annotated alignment data to improve cross-lingual transferability, which are typically expensive to obtain. In this paper, we propose a simple yet effective method, SALT, to improve the zero-shot cross-lingual transfer of the multilingual pretrained language models without the help of such external data. By incorporating code-switching and embedding mixup with self-augmentation, SALT effectively distills cross-lingual knowledge from the multilingual PLM and enhances its transferability on downstream tasks. Experimental results on XNLI and PAWS-X show that our method is able to improve zero-shot cross-lingual transferability without external data. Our code is available at https://github.com/luka-group/SALT.
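The two ingredients named in the abstract, code-switching and embedding mixup, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It uses generic token substitution and generic linear mixup, and every name in it (`code_switch`, `mixup_embeddings`, `substitutes`, `ratio`, `lam`) is hypothetical. The substitution table is hard-coded here as a stand-in; the abstract indicates the augmentation comes from the multilingual PLM itself rather than from external data.

```python
import random


def code_switch(tokens, substitutes, ratio, rng):
    """Swap tokens for other-language candidates with probability `ratio`.

    `substitutes` is a hypothetical token -> replacement table; in a
    self-augmentation setting it is assumed to be produced by the model,
    not by a bilingual dictionary.
    """
    switched = []
    for tok in tokens:
        if tok in substitutes and rng.random() < ratio:
            switched.append(substitutes[tok])
        else:
            switched.append(tok)
    return switched


def mixup_embeddings(original, substitute, lam):
    """Generic mixup: linearly interpolate two equal-length embedding vectors."""
    if len(original) != len(substitute):
        raise ValueError("embedding vectors must have the same dimension")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * o + (1.0 - lam) * s for o, s in zip(original, substitute)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    sentence = ["the", "cat", "sleeps"]
    table = {"cat": "gato", "sleeps": "duerme"}  # illustrative only
    print(code_switch(sentence, table, ratio=0.5, rng=rng))
    print(mixup_embeddings([1.0, 0.0], [0.0, 1.0], lam=0.5))
```

Code-switching acts on discrete tokens before the model sees them, while mixup acts on continuous embeddings, so the second can blend a token with its substitute instead of replacing it outright.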
