CL · May 14, 2020

Parallel Data Augmentation for Formality Style Transfer

arXiv:2005.07522v11026 citations
AI Analysis

This work addresses a data bottleneck in formality style transfer for natural language processing applications. Its contribution is incremental, since it focuses on data augmentation rather than a new model.

The paper tackles the shortage of training data for formality style transfer by proposing simple data augmentation methods that generate useful parallel sentence pairs, and it reports state-of-the-art results on the GYAFC benchmark dataset.

The main barrier to progress in the task of Formality Style Transfer is the inadequacy of training data. In this paper, we study how to augment parallel data and propose novel and simple data augmentation methods for this task to obtain useful sentence pairs with easily accessible models and systems. Experiments demonstrate that our augmented parallel data largely helps improve formality style transfer when it is used to pre-train the model, leading to the state-of-the-art results in the GYAFC benchmark dataset.

Code Implementations: 1 repo
