CL | Aug 21, 2013

An Investigation of the Sampling-Based Alignment Method and Its Contributions

arXiv:1308.4479v1 | 1 citation

Originality: Incremental advance

AI Analysis

This work addresses a specific bottleneck in machine translation alignment, offering an incremental improvement for researchers and practitioners in the field.

The paper aims to increase the number of n-gram alignments in the phrase translation tables used for statistical machine translation. It enforces alignment within distinct translation subtables and uses a standard normal distribution to allot alignment time among them, which yields better evaluation results than the original sampling-based method.

By investigating the distribution of phrase pairs in phrase translation tables, this paper describes an approach to increase the number of n-gram alignments in the phrase translation tables output by a sampling-based alignment method. The approach consists of enforcing the alignment of n-grams in distinct translation subtables so as to increase the number of n-grams. A standard normal distribution is used to allot alignment time among translation subtables, which adjusts the distribution of n-grams. This leads to better evaluation results on statistical machine translation tasks than the original sampling-based alignment approach. Furthermore, the translation quality obtained by merging phrase translation tables computed from the sampling-based alignment method and from MGIZA++ is examined.
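The time-allotment idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how subtables are indexed or which quantity the normal density is evaluated on. The code below assumes, purely for illustration, that each subtable is a (source length, target length) cell and that its share of the time budget is proportional to the standard normal density of the length difference. The function names, the `max_len` parameter, and the one-hour budget are hypothetical and not taken from the paper.

```python
import math
from typing import Dict, Tuple


def standard_normal_pdf(x: float) -> float:
    """Density of the standard normal distribution N(0, 1) at x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def allot_alignment_time(total_seconds: float, max_len: int) -> Dict[Tuple[int, int], float]:
    """Split a time budget among (source_len, target_len) subtables.

    Assumption (not from the paper): a cell's weight is the standard
    normal density of the difference between source and target n-gram
    lengths, so cells with similar lengths receive more time.
    """
    weights = {
        (i, j): standard_normal_pdf(i - j)
        for i in range(1, max_len + 1)
        for j in range(1, max_len + 1)
    }
    norm = sum(weights.values())
    # Normalize so that the allotted times sum to the total budget.
    return {cell: total_seconds * w / norm for cell, w in weights.items()}


# Hypothetical usage: one hour of alignment time, n-grams up to length 4.
budget = allot_alignment_time(total_seconds=3600.0, max_len=4)
for cell in sorted(budget):
    print(cell, round(budget[cell], 1))
```

Under these assumptions, subtables on the diagonal (equal source and target lengths) get the largest share, and the share falls off quickly as the lengths diverge.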

