CL · Jun 4

English-to-Prakrit Machine Translation via Multilingual Transfer Learning

arXiv:2606.06038 · 76.1 · Has Code
Predicted impact top 23% in CL · last 90 days · Originality Synthesis-oriented
AI Analysis

For researchers working on low-resource classical language translation, this provides a practical transfer learning approach, though results are incremental due to small dataset size.

The authors adapt IndicTrans2 for English-to-Prakrit translation by routing Prakrit to the Hindi language tag, achieving BLEU improvements on a small parallel corpus, but note limitations from data scarcity and dialect mismatch.
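The routing idea summarized above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the label `pra_Deva`, the helper names, and the "source tag, target tag, sentence" input layout are assumptions made for the example. Only the `eng_Latn` and `hin_Deva` tags come from the setting described here.

```python
# Hypothetical sketch of language-tag routing for an unsupported target language.
# Assumption: the model expects inputs shaped like "<src_tag> <tgt_tag> <sentence>".

# Tags the pretrained model already knows (illustrative subset only).
SUPPORTED_TAGS = {"eng_Latn", "hin_Deva"}

# "pra_Deva" is a made-up label for Prakrit used only in this sketch;
# it is routed to the Hindi tag because both are written in Devanagari.
TAG_ROUTES = {"pra_Deva": "hin_Deva"}


def resolve_tag(lang: str) -> str:
    """Map a requested language label to a tag the model supports."""
    tag = TAG_ROUTES.get(lang, lang)
    if tag not in SUPPORTED_TAGS:
        raise ValueError(f"unsupported language tag: {lang}")
    return tag


def tag_source(sentence: str, src_lang: str, tgt_lang: str) -> str:
    """Prefix a source sentence with the resolved source and target tags."""
    return f"{resolve_tag(src_lang)} {resolve_tag(tgt_lang)} {sentence.strip()}"


if __name__ == "__main__":
    print(tag_source("The king entered the city.", "eng_Latn", "pra_Deva"))
```

Because the routing happens entirely in the input text, the tokenizer, vocabulary, and architecture stay untouched; fine-tuning on Prakrit targets then changes what the model produces under the borrowed tag.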

We study English-to-Prakrit machine translation in a low-resource setting where the target language is unsupported by IndicTrans2. We adapt the multilingual model by mapping Prakrit to the Hindi language tag (hin_Deva) without modifying the tokenizer, vocabulary, or architecture. Using a 1,474-pair Maharashtri Prakrit parallel corpus and evaluating on a 20-sample Ardhamagadhi test set, we report corpus BLEU improvements over an untuned baseline. The results indicate that script-compatible language routing can enable feasible transfer to unsupported classical languages, while highlighting limitations due to data scarcity and dialect mismatch. Our code and trained models are publicly available at https://github.com/D3v1s0m/indictrans2-prakrit-mt.
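For readers unfamiliar with the metric, corpus BLEU pools n-gram counts over the whole test set before computing precisions, rather than averaging per-sentence scores. The sketch below is a simplified, unsmoothed version with whitespace tokenization and one reference per hypothesis. It is an assumption-laden illustration and will not necessarily match the numbers produced by the evaluation tooling used in the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus BLEU (0-100) with a single reference per hypothesis."""
    matches = [0] * max_n  # clipped n-gram matches, pooled over the corpus
    totals = [0] * max_n   # hypothesis n-gram counts, pooled over the corpus
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Counter intersection clips each n-gram at its reference count.
            matches[n - 1] += sum((h_counts & r_counts).values())
            totals[n - 1] += sum(h_counts.values())

    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    # Geometric mean of the n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty applies when the hypotheses are shorter than the references.
    brevity = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity * math.exp(log_precision)
```

With a test set of only 20 sentences, each sentence contributes a large share of the pooled counts, so small changes in a few outputs can move the score noticeably.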

