CL, Aug 11, 2020

Paraphrase Generation as Zero-Shot Multilingual Translation: Disentangling Semantic Similarity from Lexical and Syntactic Diversity

arXiv:2008.04935v2999 citations
AI Analysis

This paper addresses the challenge of generating high-quality, controllable paraphrases for NLP applications and offers a multilingual solution, though the contribution is incremental because it builds on existing multilingual NMT models.

The paper tackled the problem of generating diverse and meaningful paraphrases across multiple languages by introducing a simple algorithm that discourages n-gram overlap with the input, enabling control over lexical diversity. The result showed that this method outperformed a state-of-the-art English paraphraser in preserving meaning and grammaticality for the same diversity level, with human evaluations confirming effectiveness in non-English languages.

Recent work has shown that a multilingual neural machine translation (NMT) model can be used to judge how well a sentence paraphrases another sentence in the same language (Thompson and Post, 2020); however, attempting to generate paraphrases from such a model using standard beam search produces trivial copies or near copies. We introduce a simple paraphrase generation algorithm which discourages the production of n-grams that are present in the input. Our approach enables paraphrase generation in many languages from a single multilingual NMT model. Furthermore, the amount of lexical diversity between the input and output can be controlled at generation time. We conduct a human evaluation to compare our method to a paraphraser trained on the large English synthetic paraphrase database ParaBank 2 (Hu et al., 2019c) and find that our method produces paraphrases that better preserve meaning and are more grammatical, for the same level of lexical diversity. Additional smaller human assessments demonstrate our approach also works in two non-English languages.
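The idea of discouraging input n-grams at generation time can be illustrated with a small sketch. This is not the paper's implementation: the function names `input_ngrams` and `penalize`, the flat per-n-gram `penalty`, and the dictionary of candidate scores are all assumptions made for illustration. The sketch only shows how a decoder's next-token scores might be lowered for any token that would complete an n-gram already present in the input, with `penalty` acting as the knob for lexical diversity.

```python
def input_ngrams(tokens, max_n):
    """Collect every n-gram (n = 1..max_n) of the input as a set of tuples."""
    grams = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(tuple(tokens[i:i + n]))
    return grams


def penalize(scores, prefix, grams, max_n=4, penalty=1.0):
    """Return a copy of `scores` (token -> score) where each candidate token
    loses `penalty` once for every input n-gram it would complete, given the
    tokens generated so far in `prefix`.

    The flat penalty per n-gram is an assumption for this sketch; a real
    system could weight longer n-grams differently.
    """
    adjusted = dict(scores)
    for tok in scores:
        for n in range(1, max_n + 1):
            if n - 1 > len(prefix):
                break  # not enough generated context for this n-gram order
            context = tuple(prefix[len(prefix) - (n - 1):])
            if context + (tok,) in grams:
                adjusted[tok] -= penalty
    return adjusted


# Hypothetical usage: the input is "the cat sat" and "the" was just generated.
grams = input_ngrams(["the", "cat", "sat"], max_n=2)
adjusted = penalize({"cat": 0.0, "dog": 0.0, "sat": 0.0}, ["the"], grams,
                    max_n=2, penalty=1.0)
```

In this toy setting, "cat" is penalized twice (it completes both the unigram "cat" and the bigram "the cat"), "sat" once, and "dog" not at all, so a search procedure using the adjusted scores is steered away from copying the input. Raising `penalty` would push the output further from the input wording.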

Code Implementations: 1 repo