CL, Aug 10, 2015

Improve the Evaluation of Fluency Using Entropy for Machine Translation Evaluation Metrics

Hui Yu, Xiaofeng Wu, Wenbin Jiang, Qun Liu, Shouxun Lin

arXiv:1508.02225v22.21 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for better fluency evaluation in machine translation metrics, offering an incremental improvement for researchers and practitioners in natural language processing.

The paper argues that automatic evaluation metrics such as BLEU and METEOR reflect translation fluency inadequately, and proposes an entropy-based method that measures fluency from the distribution of matched words. Experiments show that combining this method with BLEU and METEOR improves their sentence-level correlations on the WMT 2010 and WMT 2012 datasets.

Widely used automatic evaluation metrics cannot adequately reflect the fluency of translations. N-gram-based metrics such as BLEU limit the maximum length of matched fragments to n and cannot capture matched fragments longer than n, so they reflect fluency only indirectly. METEOR, which is not limited by n-grams, uses the number of matched chunks but does not consider the length of each chunk. In this paper, we propose an entropy-based method that can adequately reflect the fluency of translations through the distribution of matched words. This method can easily be combined with widely used automatic evaluation metrics to improve the evaluation of fluency. Experiments show that the sentence-level correlations of BLEU and METEOR improve after combining them with the entropy-based method on WMT 2010 and WMT 2012.
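
The abstract does not give the formula, so the following is only a minimal sketch of how an entropy over the distribution of matched words might be computed. It assumes a word alignment is already available, groups matched words into contiguous chunks, and takes the Shannon entropy of the share of matched words falling in each chunk. The function names `chunk_lengths` and `chunk_entropy`, the alignment format, and the exact definition of the score are illustrative assumptions, not the authors' published method.

```python
import math


def chunk_lengths(alignment):
    """Group matched words into contiguous chunks and return their lengths.

    `alignment` is a hypothetical input format: for each hypothesis word,
    the index of the reference word it matches, or None if unmatched.
    A chunk is a run of hypothesis words matched to consecutive
    reference positions.
    """
    lengths = []
    run = 0
    prev = None
    for pos in alignment:
        if pos is not None and prev is not None and pos == prev + 1:
            run += 1  # the current chunk continues
        else:
            if run:
                lengths.append(run)  # close the previous chunk
            run = 1 if pos is not None else 0
        prev = pos
    if run:
        lengths.append(run)
    return lengths


def chunk_entropy(lengths):
    """Shannon entropy of the share of matched words in each chunk.

    Assumption: a lower value means the matched words sit in fewer,
    longer chunks, which is read here as a sign of a more fluent match.
    """
    total = sum(lengths)
    if total == 0:
        return 0.0
    entropy = 0.0
    for length in lengths:
        p = length / total
        entropy -= p * math.log(p)
    return entropy


# One long chunk versus four scattered single-word matches.
in_order = chunk_entropy(chunk_lengths([0, 1, 2, 3]))
scrambled = chunk_entropy(chunk_lengths([2, 0, 3, 1]))
```

Under these assumptions, both hypotheses match four words, yet the in-order one yields a lower entropy than the scrambled one. That difference is the kind of signal a count of matched chunks alone would express less finely, since it ignores chunk length. How such a score would be combined with BLEU or METEOR is not specified in the abstract and is left out of the sketch.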
