CL Sep 8, 2021

Ensemble Fine-tuned mBERT for Translation Quality Estimation

Shaika Chowdhury, Naouel Baili, Brian Vannah

arXiv:2109.03914v1 30.5650 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses quality estimation for machine translation, but its contribution is incremental: it combines existing mBERT fine-tuning methods with ensemble techniques.

The paper tackles the problem of predicting sentence-level translation quality without reference translations. It proposes an ensemble of fine-tuned mBERT models that achieves Pearson's correlation comparable to the baseline and beats the baseline in MAE/RMSE for several language pairs.

Quality Estimation (QE) is an important component of the machine translation workflow, as it assesses the quality of the translated output without consulting reference translations. In this paper, we discuss our submission to the WMT 2021 QE Shared Task. We participate in the sentence-level sub-task of Task 2, which challenges participants to predict the HTER score as a measure of sentence-level post-editing effort. Our proposed system is an ensemble of multilingual BERT (mBERT)-based regression models, which are generated by fine-tuning on different input settings. It demonstrates comparable performance with respect to Pearson's correlation and beats the baseline system in MAE/RMSE for several language pairs. In addition, we adapt our system to the zero-shot setting by exploiting target language-relevant language pairs and pseudo-reference translations.
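The evaluation described in the abstract can be illustrated with a minimal sketch. It assumes the ensemble combines its member models by simple averaging of their per-sentence HTER predictions; the abstract does not state the combination rule, so this is an assumption, and the function names (`ensemble_mean`, `pearson`, `mae`, `rmse`) are hypothetical helpers rather than anything taken from the paper.

```python
import math


def ensemble_mean(predictions_per_model):
    """Average per-sentence scores across models.

    Assumption: simple unweighted averaging; the paper's actual
    combination rule is not given in the abstract.
    """
    n_models = len(predictions_per_model)
    return [sum(scores) / n_models for scores in zip(*predictions_per_model)]


def pearson(pred, gold):
    """Pearson's correlation between predicted and gold scores."""
    mean_p = sum(pred) / len(pred)
    mean_g = sum(gold) / len(gold)
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold))
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    std_g = math.sqrt(sum((g - mean_g) ** 2 for g in gold))
    return cov / (std_p * std_g)


def mae(pred, gold):
    """Mean absolute error."""
    return sum(abs(p - g) for p, g in zip(pred, gold)) / len(pred)


def rmse(pred, gold):
    """Root mean squared error."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(pred))


if __name__ == "__main__":
    # Made-up illustrative numbers, not results from the paper.
    model_outputs = [
        [0.10, 0.40, 0.70],  # hypothetical model fine-tuned on one input setting
        [0.20, 0.50, 0.60],  # hypothetical model fine-tuned on another setting
    ]
    gold_hter = [0.12, 0.48, 0.66]
    combined = ensemble_mean(model_outputs)
    print("Pearson:", pearson(combined, gold_hter))
    print("MAE:", mae(combined, gold_hter))
    print("RMSE:", rmse(combined, gold_hter))
```

Pearson's correlation measures how well the predictions track the ordering and spread of the gold scores, while MAE and RMSE measure absolute error. That difference is why a system can be only comparable to a baseline on the former and still beat it on the latter.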
