CL, Jul 29, 2016

Connecting Phrase based Statistical Machine Translation Adaptation

arXiv:1607.08693v1, 21 citations
Originality: Incremental advance
AI Analysis

This work addresses domain adaptation for machine translation practitioners, offering incremental improvements over existing methods.

The paper tackles domain adaptation in statistical machine translation by proposing a connecting phrase based adaptation method, which improves performance by up to +1.6 BLEU over the phrase based SMT baseline and +0.9 over existing methods on IWSLT/NIST data sets.

Although more corpora are now available for Statistical Machine Translation (SMT), only those belonging to the same domain as the original corpus, or a similar one, can directly enhance SMT performance. Most existing adaptation methods focus on sentence selection. In comparison, the phrase is a smaller and more fine-grained unit for data selection; we therefore propose a straightforward and efficient connecting phrase based adaptation method, which is applied to both bilingual phrase pair and monolingual n-gram adaptation. The proposed method is evaluated on IWSLT/NIST data sets, and the results show that phrase based SMT performance is significantly improved (up to +1.6 in comparison with the phrase based SMT baseline system and +0.9 in comparison with existing methods).

