Exemplar-Controllable Paraphrasing and Translation using Bitext
This work addresses syntactically controlled paraphrasing and translation in multilingual settings where paraphrase data is unavailable. The contribution is incremental: it adapts models from prior work to train on bitext instead of paraphrase data.
The authors address the problem of generating paraphrases and translations with syntactic control without relying on costly paraphrase datasets. They train a single model solely on bilingual text (bitext) and evaluate it in near zero-shot conditions. The model achieves competitive results on controlled paraphrase generation and strong performance on controlled machine translation, as measured on three newly created evaluation datasets.
Most prior work on exemplar-based syntactically controlled paraphrase generation relies on automatically constructed large-scale paraphrase datasets, which are costly to create. We sidestep this prerequisite by adapting models from prior work to learn solely from bilingual text (bitext). Despite using only bitext for training, and in near zero-shot conditions, our single proposed model can perform four tasks: controlled paraphrase generation in both languages and controlled machine translation in both language directions. To evaluate these tasks quantitatively, we create three novel evaluation datasets. Our experimental results show that our models achieve competitive results on controlled paraphrase generation and strong performance on controlled machine translation. Analysis shows that our models learn to disentangle semantics and syntax in their latent representations, but still suffer from semantic drift.