Towards Supervised and Unsupervised Neural Machine Translation Baselines for Nigerian Pidgin
This work addresses the shortage of translation tools for Nigerian Pidgin speakers, but its contribution is incremental: it sets baselines rather than advancing state-of-the-art methods.
The authors address the lack of machine translation resources for Nigerian Pidgin by establishing supervised and unsupervised neural machine translation baselines between English and Nigerian Pidgin. They implement and compare models with different tokenization methods to provide a foundation for future work.
Nigerian Pidgin is arguably the most widely spoken language in Nigeria. Variants of this language are also spoken across West and Central Africa, making it an important language. This work aims to establish supervised and unsupervised neural machine translation (NMT) baselines between English and Nigerian Pidgin. We implement and compare NMT models with different tokenization methods, creating a solid foundation for future work.