Abubakar Isa

CL
3 papers
10 citations
Novelty: 33%
AI Score: 17

3 Papers

CL · Nov 14, 2020
A Hybrid Approach for Improved Low Resource Neural Machine Translation using Monolingual Data

Idris Abdulmumin, Bashir Shehu Galadanci, Abubakar Isa et al.

Many language pairs are low resource, meaning the amount and/or quality of available parallel data is not sufficient to train a neural machine translation (NMT) model that can reach an acceptable standard of accuracy. Many works have explored using the readily available monolingual data in either or both of the languages to improve translation models for low-resource, and even high-resource, languages. One of the most successful of these is back-translation, which uses translations of the target-language monolingual data to increase the amount of training data. The quality of the backward model, which is trained on the available parallel data, has been shown to determine the performance of the back-translation approach. Despite this, standard back-translation uses the monolingual target data to improve only the forward model. A previous study proposed an iterative back-translation approach for improving both models over several iterations, but unlike traditional back-translation, it relied on both the target and source monolingual data. This work therefore proposes a novel approach that enables both the backward and forward models to benefit from the monolingual target data, through self-learning and back-translation respectively. Experimental results show the superiority of the proposed approach over the traditional back-translation method on English-German low-resource neural machine translation. We also propose an iterative self-learning approach that outperforms iterative back-translation while relying only on the monolingual target data and requiring the training of fewer models.

CL · Jun 4, 2020
Enhanced back-translation for low resource neural machine translation using self-training

Idris Abdulmumin, Bashir Shehu Galadanci, Abubakar Isa

Improving neural machine translation (NMT) models using back-translations of the monolingual target data (synthetic parallel data) is currently the state-of-the-art approach for training improved translation systems. The quality of the backward system, which is trained on the available parallel data and used for the back-translation, has been shown in many studies to affect the performance of the final NMT model. In low-resource conditions, the available parallel data is usually not enough to train a backward model that can produce the high-quality synthetic data needed to train a standard translation model. This work proposes a self-training strategy in which the output of the backward model is used to improve the model itself through the forward translation technique. The technique was shown to improve baseline low-resource IWSLT'14 English-German and IWSLT'15 English-Vietnamese backward translation models by 11.06 and 1.5 BLEU respectively. The synthetic data generated by the improved English-German backward model was used to train a forward model that outperformed another forward model trained using standard back-translation by 2.7 BLEU.
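
The data flow described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_train` is a hypothetical stand-in that merely memorises sentence pairs, and the function names `self_train_backward` and `train_forward` are assumptions introduced here. Only the order of steps follows the abstract: train a backward model, let it translate the monolingual target data, retrain it on its own output, then use the improved model for back-translation.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]            # (input sentence, output sentence)
Model = Callable[[str], str]      # a trained translator
Trainer = Callable[[List[Pair]], Model]


def toy_train(pairs: List[Pair]) -> Model:
    """Hypothetical stand-in for NMT training: memorise the pairs."""
    table = dict(pairs)
    # Unseen inputs are wrapped in <...> to mark them as noisy guesses.
    return lambda sentence: table.get(sentence, f"<{sentence}>")


def self_train_backward(parallel: List[Pair], mono_target: List[str],
                        train: Trainer) -> Model:
    """Improve the target->source model using its own translations.

    `parallel` holds authentic (source, target) pairs.
    """
    authentic = [(tgt, src) for src, tgt in parallel]
    backward = train(authentic)
    # Forward translation from the backward model's point of view:
    # authentic target sentences in, synthetic source sentences out.
    synthetic = [(tgt, backward(tgt)) for tgt in mono_target]
    return train(authentic + synthetic)


def train_forward(parallel: List[Pair], mono_target: List[str],
                  backward: Model, train: Trainer) -> Model:
    """Standard back-translation step using the given backward model."""
    synthetic = [(backward(tgt), tgt) for tgt in mono_target]
    return train(parallel + synthetic)
```

With a real NMT toolkit, `train` would wrap a full training run and each model's call would be a decoding pass. The sketch only shows which data each model sees.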

CL · Nov 26, 2019
Iterative Batch Back-Translation for Neural Machine Translation: A Conceptual Model

Idris Abdulmumin, Bashir Shehu Galadanci, Abubakar Isa

An effective method to generate a large number of parallel sentences for training improved neural machine translation (NMT) systems is the use of back-translations of the target-side monolingual data. Recently, iterative back-translation has been shown to outperform standard back-translation, albeit only on some language pairs. This work proposes iterative batch back-translation, which aims to enhance standard iterative back-translation and enable the efficient use of more monolingual data. After each iteration, improved back-translations of new sentences are added to the parallel data that will be used to train the final forward model. The work presents a conceptual model of the proposed approach.
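
As a rough illustration of the batching idea, the loop below back-translates one new batch of monolingual target sentences per iteration and adds the result to the growing parallel data. The abstract does not say how the backward model is improved between iterations; retraining it on the accumulated data is an assumption made here for the sake of a runnable sketch. `toy_train`, `batches`, and `iterative_batch_back_translation` are hypothetical names.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]            # (source sentence, target sentence)
Model = Callable[[str], str]      # a trained translator
Trainer = Callable[[List[Pair]], Model]


def toy_train(pairs: List[Pair]) -> Model:
    """Hypothetical stand-in for NMT training: memorise the pairs."""
    table = dict(pairs)
    return lambda sentence: table.get(sentence, f"<{sentence}>")


def batches(items: List[str], size: int) -> List[List[str]]:
    """Split the monolingual data into consecutive batches."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def iterative_batch_back_translation(
    parallel: List[Pair], mono_target: List[str],
    batch_size: int, train: Trainer,
) -> Tuple[Model, List[Pair]]:
    data = list(parallel)
    for batch in batches(mono_target, batch_size):
        # Assumption: the backward model is retrained on everything
        # gathered so far, with each pair reversed to target->source.
        backward = train([(tgt, src) for src, tgt in data])
        # Back-translate only the new batch and add it to the data.
        data += [(backward(tgt), tgt) for tgt in batch]
    # The final forward model is trained on authentic + synthetic pairs.
    return train(data), data
```

Because each batch is translated by a backward model trained after the previous batches were added, later batches would, in a real system, be expected to receive better back-translations than earlier ones.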