Neural Machine Translation for Coptic-French: Strategies for Low-Resource Ancient Languages
The work offers practical insights for developing translation tools for historical languages, though its contribution is incremental: it applies existing methods to a new language pair.
This paper addresses translation from Coptic, a low-resource ancient language, into French by systematically evaluating strategies such as pivot versus direct translation and fine-tuning on varied data, and reports significant improvements in translation quality.
This paper presents the first systematic study of strategies for translating Coptic into French. Our pipeline evaluates four factors: pivot versus direct translation, the impact of pre-training, the benefits of multi-version fine-tuning, and model robustness to noise. Using aligned biblical corpora, we demonstrate that fine-tuning on a stylistically varied and noise-aware training corpus significantly enhances translation quality. Our findings provide practical insights for developing translation tools for historical languages in general.
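
The difference between pivot and direct translation can be illustrated with a minimal sketch. It is not the paper's implementation: the names `Translator`, `lookup_translator`, and `pivot_translate` are hypothetical, the pivot language is left unspecified, and toy lookup tables with placeholder tokens stand in for trained models.

```python
from typing import Callable, Dict

# A translator is modelled as any function from text to text (assumption).
Translator = Callable[[str], str]


def lookup_translator(table: Dict[str, str]) -> Translator:
    """Toy stand-in for a trained model: word-by-word table lookup.

    Tokens missing from the table are passed through unchanged.
    """
    def translate(text: str) -> str:
        return " ".join(table.get(tok, tok) for tok in text.split())
    return translate


def pivot_translate(src_to_pivot: Translator, pivot_to_tgt: Translator) -> Translator:
    """Compose two translators: source -> pivot -> target."""
    def translate(text: str) -> str:
        return pivot_to_tgt(src_to_pivot(text))
    return translate


# Placeholder vocabularies (s* = source, p* = pivot, t* = target).
direct = lookup_translator({"s1": "t1", "s2": "t2"})
pivot = pivot_translate(
    lookup_translator({"s1": "p1"}),            # first stage does not know "s2"
    lookup_translator({"p1": "t1", "p2": "t2"}),
)

direct_out = direct("s1 s2")
pivot_out = pivot("s1 s2")
```

In this toy setup the direct system translates both tokens, while the pivot system leaves `s2` untranslated because the first stage fails on it and the second stage cannot recover. This kind of error propagation is one reason a pivot route and a direct route are worth comparing empirically.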