CL, LG
Jan 18, 2022

Syntax-based data augmentation for Hungarian-English machine translation

arXiv:2201.06876v1
2 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses machine translation for Hungarian, which it treats as a low-resource language, but appears incremental: it applies existing methods to new data.

The researchers tackled Hungarian-English machine translation using Transformer models on the Hunglish2 corpus, achieving BLEU scores of 40.0 for Hungarian-English and 33.4 for English-Hungarian, and explored syntax-based data augmentation.

We train Transformer-based neural machine translation models for Hungarian-English and English-Hungarian using the Hunglish2 corpus. Our best models achieve a BLEU score of 40.0 on Hungarian-English and 33.4 on English-Hungarian. Furthermore, we present results from ongoing work on syntax-based augmentation for neural machine translation. Both our code and models are publicly available.

Code Implementations: 2 repos
