CL · LG · Dec 26, 2019

Amharic-Arabic Neural Machine Translation

arXiv:1912.13161v1 · 10 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses translation for the Amharic-Arabic language pair. The contribution is incremental: it applies existing methods to a new, data-scarce setting.

The paper tackles Amharic-Arabic neural machine translation by developing LSTM and GRU models on an attention-based encoder-decoder architecture. The LSTM model achieves a BLEU score of 12%, outperforming the GRU model (11%) and Google Translation (6%).

Much work on automatic translation has addressed major European language pairs, taking advantage of large-scale parallel corpora, but very little research has been conducted on the Amharic-Arabic language pair because parallel data is scarce. Two Neural Machine Translation (NMT) models, one based on Long Short-Term Memory (LSTM) and one on Gated Recurrent Units (GRU), are developed using an attention-based encoder-decoder architecture adapted from the open-source OpenNMT system. For the experiments, a small parallel Quranic text corpus is constructed by modifying the existing monolingual Arabic text and its equivalent Amharic translation available on Tanzile. The LSTM-based and GRU-based NMT models and the Google Translation system are compared: the LSTM-based OpenNMT model outperforms both the GRU-based OpenNMT model and the Google Translation system, with BLEU scores of 12%, 11%, and 6% respectively.
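To make the reported BLEU percentages concrete, the following is a minimal sketch of corpus-level BLEU in its standard form (clipped n-gram precision up to 4-grams, geometric mean, brevity penalty). The abstract does not say which BLEU implementation or tokenization was used, so this sketch assumes one reference per hypothesis, whitespace tokens, and no smoothing. The function names `ngrams` and `corpus_bleu` are illustrative and do not come from the paper or from OpenNMT.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU in [0, 1]; assumes one reference per hypothesis."""
    matched = [0] * max_n   # clipped n-gram matches, per n-gram order
    total = [0] * max_n     # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matched) == 0:
        return 0.0  # assumption: no smoothing, so a zero-match order gives BLEU 0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Penalize hypotheses that are shorter than their references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    # Toy English tokens used purely for illustration, not data from the paper.
    hyps = ["the cat sat on the mat".split()]
    refs = ["the cat sat on a mat".split()]
    print(f"BLEU = {100 * corpus_bleu(hyps, refs):.1f}%")
```

Multiplying the returned value by 100 gives a percentage in the same form as the scores quoted above. Results from different toolkits can differ with tokenization and smoothing choices, so scores computed this way are only comparable when all systems are evaluated with the same settings.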

