CL | Oct 24, 2022

Bilingual Synchronization: Restoring Translational Relationships with Editing Operations

arXiv:2210.13163v124.0292 citations | h-index: 35

Originality: Incremental advance

AI Analysis

This work addresses the need for more flexible and efficient machine translation workflows, particularly in interactive and translation-memory-aided settings, though it is an incremental advance that builds on existing edit-based approaches.

The paper tackles the problem of bilingual synchronization, where an initial target sequence must be edited to become a valid translation of a source text, and finds that a single generic edit-based system can match or outperform dedicated systems for tasks like interactive MT and translation memory cleaning.

Machine Translation (MT) is usually viewed as a one-shot process that generates the target language equivalent of some source text from scratch. We consider here a more general setting which assumes an initial target sequence, that must be transformed into a valid translation of the source, thereby restoring parallelism between source and target. For this bilingual synchronization task, we consider several architectures (both autoregressive and non-autoregressive) and training regimes, and experiment with multiple practical settings such as simulated interactive MT, translating with Translation Memory (TM) and TM cleaning. Our results suggest that one single generic edit-based system, once fine-tuned, can compare with, or even outperform, dedicated systems specifically trained for these tasks.
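To make the idea of editing an initial target sequence more concrete, here is a minimal sketch that derives a token-level edit script between an initial target and a desired translation, then applies it. It only illustrates what editing operations on a target sequence look like; it is not the paper's model, which learns to predict edits from the source text. The function names `edit_script` and `apply_script` and the example sentences are hypothetical, and the diff comes from Python's standard `difflib`.

```python
from difflib import SequenceMatcher


def edit_script(initial, target):
    """Return (tag, start, end, new_tokens) operations turning `initial` into `target`.

    Hypothetical helper: the tags are difflib's 'equal', 'replace', 'delete'
    and 'insert', and [start, end) indexes into `initial`.
    """
    matcher = SequenceMatcher(a=initial, b=target, autojunk=False)
    return [
        (tag, i1, i2, target[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


def apply_script(initial, script):
    """Rebuild the edited sequence by applying `script` to `initial`."""
    output = []
    for tag, start, end, new_tokens in script:
        if tag == "equal":
            # Unchanged span: copy the tokens from the initial target.
            output.extend(initial[start:end])
        else:
            # 'replace' and 'insert' contribute new tokens; 'delete' contributes none.
            output.extend(new_tokens)
    return output


# Assumed example: an outdated target that no longer matches its source.
initial_target = "the cat sleeps on the sofa".split()
desired_target = "the black cat sleeps on the couch".split()

script = edit_script(initial_target, desired_target)
for tag, start, end, new_tokens in script:
    if tag != "equal":
        print(tag, initial_target[start:end], "->", new_tokens)

synchronized = apply_script(initial_target, script)
```

In this sketch the desired translation is known in advance, so the edit script is simply read off a diff. In the synchronization task described above, the system has to produce such edits from the source text and the initial target alone.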
