CL · Apr 28, 2022

UniTE: Unified Translation Evaluation

Yu Wan, Dayiheng Liu, Baosong Yang, Haibo Zhang, Boxing Chen, Derek F. Wong, Lidia S. Chao

arXiv:2204.13346v132.3648 citations · h-index: 31 · Has Code

Originality: Highly original

AI Analysis

Existing translation evaluation methods are each built for a single input format, which is inconvenient for machine translation researchers and practitioners and ignores what the tasks have in common. This work addresses that gap and represents a novel integration rather than an incremental improvement.

The paper tackles the problem of evaluating machine translation quality across three distinct tasks (reference-only, source-only, and source-reference-combined) by proposing UniTE, a unified framework that uses monotonic regional attention and unified pretraining, achieving state-of-the-art results on WMT 2019 Metrics and WMT 2020 Quality Estimation benchmarks with a single model.

Translation quality evaluation plays a crucial role in machine translation. According to the input format, it is mainly divided into three tasks: reference-only, source-only, and source-reference-combined. Recent methods, despite their promising results, are each designed and optimized for only one of these tasks. This limits their convenience and overlooks the commonalities among the tasks. In this paper, we propose UniTE, the first unified framework able to handle all three evaluation tasks. Concretely, we propose monotonic regional attention to control the interaction among input segments, and unified pretraining to better adapt to multi-task learning. We test our framework on the WMT 2019 Metrics and WMT 2020 Quality Estimation benchmarks. Extensive analyses show that our single model can universally surpass various state-of-the-art or winning methods across tasks. Both source code and associated models are available at https://github.com/NLP2CT/UniTE.
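To make the three input formats and the idea of controlling interaction among input segments more concrete, here is a minimal sketch in plain Python. It is not taken from the UniTE code base. The segment labels (`hyp`, `src`, `ref`), the helper names, and the particular blocked pair are assumptions chosen for illustration; the abstract does not state which segment interactions the monotonic regional attention actually restricts.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical segment labels: hypothesis (system output), source, reference.
# Each task concatenates a different subset of segments into one input.
SEGMENTS_BY_TASK: Dict[str, List[str]] = {
    "reference-only": ["hyp", "ref"],
    "source-only": ["hyp", "src"],
    "source-reference-combined": ["hyp", "src", "ref"],
}


def segment_ids(task: str, lengths: Dict[str, int]) -> List[str]:
    """Label every token position with the segment it belongs to."""
    ids: List[str] = []
    for name in SEGMENTS_BY_TASK[task]:
        ids.extend([name] * lengths[name])
    return ids


def regional_mask(ids: List[str], blocked: Set[Tuple[str, str]]) -> List[List[bool]]:
    """Build a boolean attention mask over token positions.

    mask[q][k] is True when query position q may attend to key position k.
    `blocked` holds (query_segment, key_segment) pairs that may not interact.
    """
    return [[(q, k) not in blocked for k in ids] for q in ids]


# Toy token counts per segment (assumed values).
lengths = {"hyp": 2, "src": 1, "ref": 1}
ids = segment_ids("source-reference-combined", lengths)

# Illustrative choice only: stop reference tokens from attending to source tokens,
# while leaving the opposite direction open.
mask = regional_mask(ids, blocked={("ref", "src")})
for label, row in zip(ids, mask):
    print(label, row)
```

Blocking a pair in one direction only makes the interaction between two regions one-way, which is one possible reading of "monotonic"; the released code at the URL above is the authority on the pattern UniTE uses.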
