Incorporate Semantic Structures into Machine Translation Evaluation via UCCA
This work addresses machine translation evaluation by incorporating semantic structures, offering an incremental improvement over current methods.
The paper tackles the problem of evaluating machine translation by identifying semantic core words using UCCA and weighting sentence similarity based on their overlap, resulting in consistent performance improvements for existing lexical similarity metrics.
The copying mechanism is commonly used in neural paraphrasing networks and other text generation tasks, where important words in the input sequence are preserved in the output sequence. Similarly, in machine translation, we observe that certain words or phrases appear in all good translations of a given source text, and these words tend to convey important semantic information. Therefore, in this work, we define the words that carry important semantic meaning in a sentence as semantic core words. We further propose an MT evaluation approach named Semantically Weighted Sentence Similarity (SWSS). It uses UCCA to identify semantic core words and then calculates sentence similarity scores based on the overlap of those core words. Experimental results show that SWSS consistently improves the performance of popular MT evaluation metrics that are based on lexical similarity.
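As a rough illustration of the idea, the sketch below scores the overlap of semantic core words between a candidate and a reference translation and blends it with an existing lexical metric score. It is not the paper's formulation: the abstract does not give the exact SWSS formula, so the F1-style overlap, the linear interpolation, the `alpha` weight, and the function names `core_word_overlap` and `combined_score` are all assumptions made for illustration. The core words are assumed to have been extracted beforehand (for example by a UCCA parser), a step not shown here.

```python
from collections import Counter


def core_word_overlap(candidate_core, reference_core):
    """F1-style overlap between two lists of semantic core words.

    Assumption: core words are already extracted (e.g. from a UCCA parse);
    the F1 formulation is illustrative, not the paper's definition.
    """
    cand = Counter(w.lower() for w in candidate_core)
    ref = Counter(w.lower() for w in reference_core)
    # Multiset intersection counts each shared word at most min(count) times.
    matched = sum((cand & ref).values())
    if matched == 0:
        return 0.0
    precision = matched / sum(cand.values())
    recall = matched / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def combined_score(base_score, candidate_core, reference_core, alpha=0.5):
    """Blend a lexical metric score with the core-word overlap.

    Assumption: a simple linear interpolation with a hypothetical weight
    `alpha`; `base_score` is any lexical similarity score in [0, 1].
    """
    overlap = core_word_overlap(candidate_core, reference_core)
    return alpha * base_score + (1 - alpha) * overlap


if __name__ == "__main__":
    reference_core = ["minister", "signed", "agreement"]
    candidate_core = ["minister", "signed", "contract"]
    print(core_word_overlap(candidate_core, reference_core))
    print(combined_score(0.4, candidate_core, reference_core))
```

In this sketch, a candidate that preserves more of the reference's core words receives a higher combined score even when its surface-level lexical score is unchanged, which is the intuition behind weighting similarity by semantic core words.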