Syntax-based data augmentation for Hungarian-English machine translation
This work addresses machine translation for Hungarian, a low-resource language, but it appears incremental because it applies existing methods to new data.
The researchers tackled Hungarian-English machine translation using Transformer models on the Hunglish2 corpus, achieving BLEU scores of 40.0 for Hungarian-English and 33.4 for English-Hungarian, and explored syntax-based data augmentation.
We train Transformer-based neural machine translation models for Hungarian-English and English-Hungarian using the Hunglish2 corpus. Our best models achieve a BLEU score of 40.0 on Hungarian-English and 33.4 on English-Hungarian. Furthermore, we present results from ongoing work on syntax-based data augmentation for neural machine translation. Both our code and models are publicly available.
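For readers unfamiliar with the metric, the following is a minimal sketch of how a corpus-level BLEU score can be computed. It is an illustration only, not the evaluation script behind the scores reported above: the abstract does not state which BLEU implementation or tokenization was used, so the function names (`ngrams`, `corpus_bleu`), the whitespace tokenization, the single reference per sentence, and the absence of smoothing are all assumptions. Scores from this sketch are therefore not comparable with the reported 40.0 and 33.4.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale, one reference per hypothesis.

    Assumptions of this sketch: whitespace tokenization, no smoothing.
    """
    matches = [0] * max_n   # clipped n-gram matches, per order
    totals = [0] * max_n    # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Counter intersection clips each hypothesis n-gram count
            # to its count in the reference.
            matches[n - 1] += sum((h_counts & r_counts).values())
            totals[n - 1] += sum(h_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty lowers the score when the output is shorter
    # than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)
```

Calling `corpus_bleu(system_outputs, reference_translations)` on two equal-length lists of sentences returns a single score for the whole test set. Because n-gram counts are pooled over all sentences before the precisions are computed, the result is generally not the average of per-sentence scores.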