CL Oct 4, 2017

Discourse Structure in Machine Translation Evaluation

Shafiq Joty, Francisco Guzmán, Lluís Màrquez, Preslav Nakov

arXiv:1710.01504v139.31099 citations h-index: 64

Originality: Incremental advance

AI Analysis

This work addresses the need for more accurate evaluation metrics in machine translation, offering incremental improvements by integrating discourse information.

The paper incorporates sentence-level discourse structure into machine translation evaluation. It shows that discourse-aware similarity measures, linearly combined with existing metrics, improve those metrics' correlation with human judgments at both the segment and system level. It also suggests that discourse information could inform richer combined metrics such as DiscoTKparty.

In this article, we explore the potential of using sentence-level discourse structure for machine translation evaluation. We first design discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory (RST). Then, we show that a simple linear combination with these measures can help improve various existing machine translation evaluation metrics regarding correlation with human judgments both at the segment- and at the system-level. This suggests that discourse information is complementary to the information used by many of the existing evaluation metrics, and thus it could be taken into account when developing richer evaluation metrics, such as the WMT-14 winning combined metric DiscoTKparty. We also provide a detailed analysis of the relevance of various discourse elements and relations from the RST parse trees for machine translation evaluation. In particular, we show that: (i) all aspects of the RST tree are relevant, (ii) nuclearity is more useful than relation type, and (iii) the similarity of the translation RST tree to the reference tree is positively correlated with translation quality.
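
The two ideas in the abstract, a tree kernel over discourse parse trees and a linear combination with an existing metric, can be illustrated with a minimal sketch. It assumes a Collins and Duffy style fragment-counting recursion as the kernel, which may differ from the exact kernel used in the paper. The tree encoding, the node labels, the function names, and the combination weights are all hypothetical and chosen only for illustration.

```python
from itertools import product
from math import sqrt

# Assumed encoding: a tree is a tuple (label, child, child, ...);
# a leaf is a 1-tuple (label,). Labels below are illustrative only.

def nodes(tree):
    """Yield every node of the tree in pre-order."""
    yield tree
    for child in tree[1:]:
        yield from nodes(child)

def production(node):
    """The node label together with the ordered labels of its children."""
    return (node[0], tuple(child[0] for child in node[1:]))

def common_fragments(n1, n2, decay=1.0):
    """Weighted count of tree fragments rooted at both n1 and n2."""
    if len(n1) == 1 or len(n2) == 1:      # leaves root no fragments
        return 0.0
    if production(n1) != production(n2):  # different local structure
        return 0.0
    score = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        score *= 1.0 + common_fragments(c1, c2, decay)
    return score

def tree_kernel(t1, t2, decay=1.0):
    """Sum the shared-fragment counts over all pairs of nodes."""
    return sum(common_fragments(a, b, decay)
               for a, b in product(nodes(t1), nodes(t2)))

def normalized_kernel(t1, t2, decay=1.0):
    """Scale the kernel to [0, 1] so that identical trees score 1."""
    denom = sqrt(tree_kernel(t1, t1, decay) * tree_kernel(t2, t2, decay))
    return tree_kernel(t1, t2, decay) / denom if denom else 0.0

def combine(base_score, discourse_score, w_base=0.8, w_disc=0.2):
    """Linear combination; the weights are placeholders, not tuned values."""
    return w_base * base_score + w_disc * discourse_score

# Toy discourse trees: the hypothesis gets the nuclearity of one unit wrong.
reference = ("Elaboration", ("NUC", ("e1",)), ("SAT", ("e2",)))
hypothesis = ("Elaboration", ("NUC", ("e1",)), ("NUC", ("e2",)))

discourse_similarity = normalized_kernel(reference, hypothesis)
combined_score = combine(0.5, discourse_similarity)  # 0.5 is a made-up base metric score
```

In this toy case the nuclearity mismatch removes every shared fragment rooted at the top of the tree, so the similarity drops well below 1. In a real setting the weights would be fit against human judgments rather than fixed by hand.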
