CL, May 18, 2023

mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences

arXiv:2305.11129v2, 135 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of handling long text inputs in multilingual NLP tasks. The contribution is incremental, since it builds on existing architectures and datasets.

The researchers developed mLongT5, a multilingual text-to-text transformer for longer sequences, by combining LongT5's architecture with the multilingual datasets used to pretrain mT5 and the pretraining tasks of UL2. On multilingual summarization and question-answering tasks, mLongT5 performed more strongly than existing multilingual models such as mBART and M-BERT.

We present our work on developing a multilingual, efficient text-to-text transformer that is suitable for handling long inputs. This model, called mLongT5, builds upon the architecture of LongT5, while leveraging the multilingual datasets used for pretraining mT5 and the pretraining tasks of UL2. We evaluate this model on a variety of multilingual summarization and question-answering tasks, and the results show stronger performance for mLongT5 when compared to existing multilingual models such as mBART or M-BERT.

Code Implementations: 1 repo
