mPMR: A Multilingual Pre-trained Machine Reader at Scale
This work addresses the challenge of enabling multilingual pre-trained language models (mPLMs) to handle sequence classification and span extraction across languages. The contribution is incremental in that it builds on existing mPLMs rather than replacing them.
The paper tackles the problem of cross-lingual generalization in natural language understanding by introducing mPMR, a multilingual pre-trained machine reader that directly inherits NLU capabilities from MRC-style pre-training, resulting in improved performance for target languages.
We present multilingual Pre-trained Machine Reader (mPMR), a novel method for multilingual machine reading comprehension (MRC)-style pre-training. mPMR aims to guide multilingual pre-trained language models (mPLMs) to perform natural language understanding (NLU) including both sequence classification and span extraction in multiple languages. To achieve cross-lingual generalization when only source-language fine-tuning data is available, existing mPLMs solely transfer NLU capability from a source language to target languages. In contrast, mPMR allows the direct inheritance of multilingual NLU capability from the MRC-style pre-training to downstream tasks. Therefore, mPMR acquires better NLU capability for target languages. mPMR also provides a unified solver for tackling cross-lingual span extraction and sequence classification, thereby enabling the extraction of rationales to explain the sentence-pair classification process.
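To make the idea of a unified solver more concrete, the following is a minimal sketch of how span extraction and sentence-pair classification could both be cast as (query, context) reading-comprehension examples. The abstract does not specify the input format, so everything here is an illustrative assumption rather than the authors' implementation: the `MRCExample` container, the `is_match` flag, the two conversion helpers, and the label queries are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MRCExample:
    """Hypothetical container for one MRC-style example (not from the paper)."""
    query: str
    context: str
    # Character-level (start, end) answer spans in `context`, end exclusive.
    spans: List[Tuple[int, int]] = field(default_factory=list)
    # True when the query holds for the context as a whole.
    is_match: bool = False


def span_extraction_to_mrc(text: str, query: str, mentions: List[str]) -> MRCExample:
    """Cast a span-extraction instance (e.g. NER for one entity type) as MRC.

    Assumption: each mention is located by its first occurrence in `text`.
    """
    spans = []
    for mention in mentions:
        start = text.find(mention)
        if start != -1:
            spans.append((start, start + len(mention)))
    return MRCExample(query=query, context=text, spans=spans, is_match=bool(spans))


def classification_to_mrc(
    premise: str, hypothesis: str, gold_label: str, label_queries: Dict[str, str]
) -> List[MRCExample]:
    """Cast a sentence-pair classification instance as one MRC example per label.

    Assumption: the example whose query describes the gold label is marked
    as matching; all other label queries are marked as non-matching.
    """
    context = premise + " " + hypothesis
    return [
        MRCExample(query=query, context=context, is_match=(label == gold_label))
        for label, query in label_queries.items()
    ]


# Usage: both tasks end up in the same (query, context) shape.
ner_example = span_extraction_to_mrc(
    "Angela Merkel besuchte Paris.", "person", ["Angela Merkel"]
)
nli_examples = classification_to_mrc(
    "A man sleeps.",
    "A person is awake.",
    "contradiction",
    {
        "entailment": "The second sentence follows from the first.",
        "contradiction": "The second sentence contradicts the first.",
        "neutral": "The second sentence is unrelated to the first.",
    },
)
```

Under this framing a single model can consume both kinds of examples, and any spans predicted for a classification query could in principle serve as rationales for the predicted label.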