Extending Word-Level Quality Estimation for Post-Editing Assistance
This work is an incremental improvement to machine translation post-editing efficiency, aimed at translators and editors.
The paper tackles the problem of improving post-editing assistance in machine translation by proposing a refined word-level quality estimation task that directly indicates editing operations, based on extended word alignment extracted using mBERT. Experiments on two language pairs demonstrate the method's feasibility, though no concrete performance numbers are provided.
We define a novel concept, extended word alignment, to make post-editing assistance more efficient. Based on extended word alignment, we further propose a novel task, refined word-level QE, which outputs refined tags and word-level correspondences. Compared with the original word-level QE, the new task directly points out editing operations and thus improves efficiency. To extract extended word alignment, we adopt a supervised method based on mBERT. To solve refined word-level QE, we first predict the original QE tags by training a regression model for sequence tagging based on mBERT and XLM-R. We then refine the original word tags with extended word alignment. In addition, we extract source-gap correspondences and obtain gap tags at the same time. Experiments on two language pairs show the feasibility of our method and suggest directions for further improvement.
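The refinement step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the tag names REP, DEL, and INS, the function names, and the rule that a BAD word with an aligned source word is a replacement while an unaligned BAD word is a deletion are all assumptions made for illustration. The inputs (OK/BAD tags, word alignment pairs, source-gap pairs) are assumed to come from upstream models such as those based on mBERT and XLM-R.

```python
from typing import List, Set, Tuple


def refine_word_tags(mt_tags: List[str],
                     alignment: Set[Tuple[int, int]]) -> List[str]:
    """Turn OK/BAD tags into hypothetical refined tags.

    mt_tags:   one "OK" or "BAD" tag per target (MT) word.
    alignment: (source_index, target_index) pairs, assumed to be
               the extracted extended word alignment.
    """
    aligned_targets = {tgt for _, tgt in alignment}
    refined = []
    for j, tag in enumerate(mt_tags):
        if tag == "OK":
            refined.append("OK")
        elif j in aligned_targets:
            # Assumption: a BAD word with a source counterpart should be replaced.
            refined.append("REP")
        else:
            # Assumption: a BAD word with no source counterpart should be deleted.
            refined.append("DEL")
    return refined


def gap_tags(num_target_words: int,
             source_gap_pairs: Set[Tuple[int, int]]) -> List[str]:
    """Tag each of the num_target_words + 1 gaps.

    source_gap_pairs: (source_index, gap_index) correspondences.
    Assumption: a gap linked to a source word needs an insertion.
    """
    gaps_with_source = {gap for _, gap in source_gap_pairs}
    return ["INS" if g in gaps_with_source else "OK"
            for g in range(num_target_words + 1)]


# Toy example with made-up inputs.
word_result = refine_word_tags(["OK", "BAD", "BAD"], {(0, 0), (1, 1)})
gap_result = gap_tags(3, {(2, 3)})
print(word_result)
print(gap_result)
```

In this toy example the second target word is aligned to a source word and flagged BAD, so it is marked for replacement; the third has no alignment and is marked for deletion; and the final gap is linked to an otherwise unaligned source word, so it is marked for insertion.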