Automated Prediction of Medieval Arabic Diacritics
This work addresses automated diacritization of Medieval Arabic. The contribution is incremental: it builds on existing methods, with a focus on optimizing context size.
The study diacritizes Medieval Arabic text using a character-level neural machine translation approach with an LSTM-based bidirectional RNN architecture, and it improves on an online baseline tool.
This study uses a character-level neural machine translation approach, built on a bidirectional long short-term memory (LSTM) recurrent neural network architecture, for the diacritization of Medieval Arabic. The results improve on those of the online tool used as a baseline. A diacritization model has been published openly through an easy-to-use Python package available on PyPI and Zenodo. We found that context size should be considered when optimizing a feasible prediction model.
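To make the role of context size concrete, the following is a minimal sketch of how character-level training pairs for such a translation setup might be prepared: the source side is the undiacritized text and the target side is the diacritized text, both split into characters. The function names, the set of code points treated as diacritics, the "_" word-boundary token, and the use of non-overlapping word chunks are all assumptions made for illustration; they are not taken from the published package.

```python
import re

# Assumption: the diacritics are the Arabic combining marks
# fathatan..sukun (U+064B-U+0652) plus superscript alef (U+0670).
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(text):
    """Remove the diacritic marks to obtain the source side of a pair."""
    return DIACRITICS.sub("", text)


def to_char_tokens(text):
    """Split text into space-separated characters; '_' marks word boundaries."""
    return " ".join("_" if ch == " " else ch for ch in text)


def make_training_pairs(diacritized_text, context_size):
    """Build (source, target) pairs from chunks of `context_size` words.

    Hypothetical helper: each chunk is one training example, so a larger
    context_size gives the model more neighbouring words per example.
    """
    words = diacritized_text.split()
    pairs = []
    for start in range(0, len(words), context_size):
        chunk = " ".join(words[start:start + context_size])
        source = to_char_tokens(strip_diacritics(chunk))
        target = to_char_tokens(chunk)
        pairs.append((source, target))
    return pairs


if __name__ == "__main__":
    # "kataba" written with three fatha marks, repeated as three words.
    word = "\u0643\u064E\u062A\u064E\u0628\u064E"
    for src, tgt in make_training_pairs(" ".join([word] * 3), context_size=2):
        print(src, "->", tgt)
```

Varying `context_size` in a sketch like this changes how many words each example spans, which is the kind of parameter the study suggests tuning.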