CL, Nov 20, 2019

Controlling Neural Machine Translation Formality with Synthetic Supervision

arXiv:1911.08706v2, 39 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for audience-appropriate machine translation. It is an incremental advance: it builds on existing multi-task models and contributes a novel training approach.

The paper tackles formality control in neural machine translation. Because labeled bilingual data is lacking, it introduces a training scheme that generates synthetic training triplets. The resulting model outperforms existing models at matching the desired formality level while preserving meaning.

This work aims to produce translations that convey source language content at a formality level that is appropriate for a particular audience. Framing this problem as a neural sequence-to-sequence task ideally requires training triplets consisting of a bilingual sentence pair labeled with target language formality. However, in practice, available training examples are limited to English sentence pairs of different styles, and bilingual parallel sentences of unknown formality. We introduce a novel training scheme for multi-task models that automatically generates synthetic training triplets by inferring the missing element on the fly, thus enabling end-to-end training. Comprehensive automatic and human assessments show that our best model outperforms existing models by producing translations that better match desired formality levels while preserving the source meaning.
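The idea of filling in a missing triplet element can be illustrated with a minimal sketch. It covers only one case, a bilingual pair whose formality label is unknown, and it is not the paper's implementation: the `Triplet` type, the `toy_formality_scorer` function, and the `INFORMAL_MARKERS` word list are all hypothetical stand-ins for whatever the trained model would infer during training.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Triplet:
    """A bilingual sentence pair plus a target-side formality label."""
    source: str
    target: str
    formality: str  # "formal" or "informal"


# Hypothetical word list, used only to keep this example self-contained.
INFORMAL_MARKERS = {"gonna", "wanna", "hey", "yeah", "kinda"}


def toy_formality_scorer(sentence: str) -> str:
    """Assumed stand-in for a learned formality inference step."""
    tokens = {tok.strip(".,!?").lower() for tok in sentence.split()}
    return "informal" if tokens & INFORMAL_MARKERS else "formal"


def synthesize_triplets(
    pairs: Iterable[Tuple[str, str]],
    infer_label: Callable[[str], str] = toy_formality_scorer,
) -> Iterator[Triplet]:
    """Attach an inferred formality label to each unlabeled bilingual pair.

    Labels are produced lazily, one pair at a time, to mimic inferring the
    missing element on the fly rather than in a preprocessing pass.
    """
    for source, target in pairs:
        yield Triplet(source, target, infer_label(target))


if __name__ == "__main__":
    unlabeled = [
        ("Bonjour, comment allez-vous ?", "Hello, how are you?"),
        ("Salut, ça va ?", "Hey, how are you?"),
    ]
    for triplet in synthesize_triplets(unlabeled):
        print(triplet)
```

In a real system the label would come from a model rather than a word list. The generator form is chosen here so that each label is computed at the moment the example is consumed.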

Code Implementations: 1 repo
