CL LG ML

Jun 7, 2020

Growing Together: Modeling Human Language Learning With n-Best Multi-Checkpoint Machine Translation

El Moatez Billah Nagoudi, Muhammad Abdul-Mageed, Hasan Cavusoglu

arXiv:2006.04050v1

Originality: Synthesis-oriented

AI Analysis

This work addresses language education by providing a method to simulate human learning stages, though it is incremental as it builds on existing ensemble and checkpoint techniques.

The paper models human language learning with an ensemble of machine translation checkpoints taken at different training stages, which generates translations with varying levels of fluency. The ensemble achieves a macro F1 score of 37.57 on an English-to-Portuguese test set, outperforming a baseline system that scores 21.30.

We describe our submission to the 2020 Duolingo Shared Task on Simultaneous Translation And Paraphrase for Language Education (STAPLE) (Mayhew et al., 2020). We view MT models at various training stages (i.e., checkpoints) as human learners at different levels. Hence, we employ an ensemble of multiple checkpoints from the same model to generate translation sequences with various levels of fluency. For our best model, we sample n-best sequences (n=10) from each checkpoint with a beam width of 100. We achieve 37.57 macro F1 with a 6-checkpoint model ensemble on the official English-to-Portuguese shared task test data, outperforming a baseline Amazon translation system that scores 21.30 macro F1 and demonstrating the utility of our intuitive method.
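The multi-checkpoint idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `NBestDecoder` type, `ensemble_nbest`, and `make_toy_checkpoint` are hypothetical names introduced here, and merging by order-preserving union is an assumption, since the abstract does not say how the per-checkpoint n-best lists are combined. Each "checkpoint" is a stand-in function that returns a fixed list of translations; in practice it would wrap beam search over a saved model.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical decoder signature: (source sentence, n) -> up to n translations.
# A real decoder would run beam search on one saved checkpoint.
NBestDecoder = Callable[[str, int], List[str]]


def ensemble_nbest(
    source: str, decoders: Sequence[NBestDecoder], n: int = 10
) -> List[str]:
    """Union the n-best lists of all checkpoints, keeping first-seen order.

    Assumption: duplicates across checkpoints are collapsed into one entry.
    """
    seen: Dict[str, None] = {}  # dicts preserve insertion order
    for decode in decoders:
        for hypothesis in decode(source, n)[:n]:
            seen.setdefault(hypothesis.strip(), None)
    return list(seen)


def make_toy_checkpoint(outputs: List[str]) -> NBestDecoder:
    """Build a fake checkpoint that always returns the same ranked outputs."""
    def decode(source: str, n: int) -> List[str]:
        return outputs[:n]
    return decode


# Toy stand-ins for an early (less fluent) and a late (more fluent) checkpoint.
early = make_toy_checkpoint(["eu sou menino", "eu sou um menino"])
late = make_toy_checkpoint(
    ["eu sou um menino", "sou um menino", "eu sou um garoto"]
)

candidates = ensemble_nbest("i am a boy", [early, late], n=2)
print(candidates)
```

With the settings reported in the abstract, `n` would be 10 and `decoders` would hold six checkpoints of the same model; the toy call above uses `n=2` and two checkpoints only to keep the output small.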
