CL · Oct 24, 2022

Bilingual Synchronization: Restoring Translational Relationships with Editing Operations

arXiv:2210.13163v1292 citations · h-index: 35
Originality: Incremental advance
AI Analysis

This work addresses the need for more flexible and efficient machine translation workflows, particularly in interactive and memory-aided settings, though it is an incremental advance that builds on existing edit-based approaches.

The paper tackles the problem of bilingual synchronization, where an initial target sequence must be edited to become a valid translation of a source text, and finds that a single generic edit-based system can match or outperform dedicated systems for tasks like interactive MT and translation memory cleaning.

Machine Translation (MT) is usually viewed as a one-shot process that generates the target language equivalent of some source text from scratch. We consider here a more general setting which assumes an initial target sequence, that must be transformed into a valid translation of the source, thereby restoring parallelism between source and target. For this bilingual synchronization task, we consider several architectures (both autoregressive and non-autoregressive) and training regimes, and experiment with multiple practical settings such as simulated interactive MT, translating with Translation Memory (TM) and TM cleaning. Our results suggest that one single generic edit-based system, once fine-tuned, can compare with, or even outperform, dedicated systems specifically trained for these tasks.
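To make the notion of "editing an initial target sequence" concrete, here is a minimal sketch of a token-level edit script between an initial target and a revised translation. It is an illustration only, not the paper's models or training procedure: it uses Python's standard `difflib` to derive the operations from a known revised target, whereas the systems described above must predict the edits from the source text. The function names `derive_edits` and `apply_edits` are hypothetical.

```python
from difflib import SequenceMatcher


def derive_edits(initial_tokens, target_tokens):
    """Return a list of (tag, i1, i2, new_tokens) edit operations.

    tag is one of "equal", "replace", "delete", "insert" (difflib's opcodes);
    i1:i2 is the span in initial_tokens, new_tokens is the material to emit.
    """
    matcher = SequenceMatcher(a=initial_tokens, b=target_tokens, autojunk=False)
    return [
        (tag, i1, i2, target_tokens[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


def apply_edits(initial_tokens, edits):
    """Rebuild the revised target by applying the edit script in order."""
    output = []
    for tag, i1, i2, new_tokens in edits:
        if tag == "equal":
            # Keep the tokens of the initial target unchanged.
            output.extend(initial_tokens[i1:i2])
        elif tag in ("replace", "insert"):
            # Emit the new tokens (for "replace", the old span is dropped).
            output.extend(new_tokens)
        # For "delete", nothing is emitted.
    return output


if __name__ == "__main__":
    # Hypothetical example: an initial target that no longer matches its source.
    initial = "the cat sat on the mat".split()
    revised = "the black cat sat on a mat yesterday".split()
    edits = derive_edits(initial, revised)
    for edit in edits:
        if edit[0] != "equal":
            print(edit)
    print(" ".join(apply_edits(initial, edits)))
```

Applying the derived script to the initial target always reproduces the revised target; this round trip is what the sketch is meant to show.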

Code Implementations: 1 repo
