CL, Apr 18, 2021

mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs

Zewen Chi, Li Dong, Shuming Ma, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei

arXiv:2104.08692v2 31.3687 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for better multilingual models for tasks like classification and summarization, but it is incremental over existing methods like mT5.

The paper introduces mT6, which extends mT5 with translation pairs and new cross-lingual pre-training tasks, and reports improved cross-lingual transferability on eight benchmark datasets.

Multilingual T5 (mT5) pretrains a sequence-to-sequence model on massive monolingual texts, which has shown promising results on many cross-lingual tasks. In this paper, we improve multilingual text-to-text transfer Transformer with translation pairs (mT6). Specifically, we explore three cross-lingual text-to-text pre-training tasks, namely, machine translation, translation pair span corruption, and translation span corruption. In addition, we propose a partially non-autoregressive objective for text-to-text pre-training. We evaluate the methods on eight multilingual benchmark datasets, including sentence classification, named entity recognition, question answering, and abstractive summarization. Experimental results show that the proposed mT6 improves cross-lingual transferability over mT5.
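As a rough illustration of what span corruption over a translation pair could look like, the sketch below concatenates a source sentence and its translation, replaces chosen spans with sentinel tokens, and builds the matching target sequence. Everything here is an assumption made for illustration rather than the authors' implementation: the function names `span_corrupt` and `corrupt_translation_pair`, whitespace tokenization, the hand-picked span positions, and the T5-style `<extra_id_N>` sentinel naming. It also does not cover the partially non-autoregressive objective.

```python
from typing import List, Tuple


def span_corrupt(
    tokens: List[str], spans: List[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Replace each (start, length) span with a sentinel token.

    Returns (corrupted_input, target). The target lists each sentinel
    followed by the tokens it replaced, in T5-style span-corruption format.
    Sentinel names are an assumption, not taken from the paper.
    """
    corrupted: List[str] = []
    target: List[str] = []
    cursor = 0
    for i, (start, length) in enumerate(sorted(spans)):
        # Reject empty, overlapping, or out-of-range spans.
        if length <= 0 or start < cursor or start + length > len(tokens):
            raise ValueError(f"invalid span: {(start, length)}")
        sentinel = f"<extra_id_{i}>"
        corrupted.extend(tokens[cursor:start])   # keep text before the span
        corrupted.append(sentinel)               # mask the span in the input
        target.append(sentinel)                  # announce the span in the target
        target.extend(tokens[start:start + length])
        cursor = start + length
    corrupted.extend(tokens[cursor:])            # keep text after the last span
    return corrupted, target


def corrupt_translation_pair(
    source: str, translation: str, spans: List[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Concatenate a translation pair and corrupt spans across both sides."""
    tokens = source.split() + translation.split()
    return span_corrupt(tokens, spans)


if __name__ == "__main__":
    inp, tgt = corrupt_translation_pair(
        "Thank you very much", "Merci beaucoup", spans=[(1, 2), (5, 1)]
    )
    print(" ".join(inp))
    print(" ".join(tgt))
```

Because spans may fall on either side of the concatenation, the model can use the other language as context when filling in a masked span. Restricting the spans to one side of the pair would be one way to approximate a variant that masks only one language.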
