CL Nov 6, 2021

Analyzing Architectures for Neural Machine Translation Using Low Computational Resources

Aditya Mandke, Onkar Litake, Dipali Kadam

arXiv:2111.03813v1 0.21 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of resource limitations for practitioners in machine translation, though it is incremental as it compares existing architectures without introducing new methods.

The study analyzed neural machine translation architectures under low computational resources, finding that transformers achieved higher accuracy but LSTMs trained faster with competitive performance, making them suitable for time-constrained scenarios.

With recent developments in Natural Language Processing, a growing variety of architectures is being used for Neural Machine Translation. Transformer architectures achieve state-of-the-art accuracy, but they are computationally very expensive to train. Not everyone has access to setups with high-end GPUs and other resources. We train our models on low computational resources and investigate the results. As expected, transformers outperformed other architectures, but there were some surprising results. Transformers with more encoders and decoders took longer to train but achieved lower BLEU scores. The LSTM performed well in the experiment and took comparatively less time to train than transformers, making it suitable for situations with time constraints.
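
Since the comparison above rests on BLEU, a minimal sketch of how a corpus-level BLEU score can be computed may help in reading the results. The abstract does not say which BLEU implementation or tokenization the authors used, so this follows the standard definition (clipped n-gram precision up to 4-grams, geometric mean, brevity penalty) with one reference per hypothesis, no smoothing, and pre-tokenized input. The names `ngrams` and `corpus_bleu` are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus-level BLEU with one reference per hypothesis.

    Assumption: inputs are lists of token lists. Returns a score in [0, 1];
    papers often report this value multiplied by 100.
    """
    matched = [0] * max_n   # clipped n-gram matches, per n-gram order
    total = [0] * max_n     # n-grams produced by the system, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Counter intersection clips each hypothesis n-gram count
            # by its count in the reference.
            matched[n - 1] += sum((hyp_counts & ref_counts).values())
            total[n - 1] += sum(hyp_counts.values())

    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matched) == 0:
        return 0.0

    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # The brevity penalty lowers the score of outputs shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    hyp = ["the cat sat on the mat".split()]
    ref = ["the cat sat on a mat".split()]
    print(corpus_bleu(hyp, ref))
```

Under this definition, a higher score means more n-gram overlap with the reference translation, so a model with a lower score is the weaker one on this metric.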
