CL · Sep 9, 2019

Unsupervised Paraphrasing by Simulated Annealing

arXiv:1909.03588v2 · 1025 citations
AI Analysis

This paper addresses the problem of generating paraphrases without parallel data for NLP applications and offers a solution that generalizes across domains, though the contribution is incremental in that it builds on existing optimization methods.

The paper tackles unsupervised paraphrase generation by proposing UPSA, which models the task as an optimization problem solved with simulated annealing. It achieves state-of-the-art performance among unsupervised methods on benchmarks such as Quora and MSCOCO, and outperforms most existing domain-adapted supervised models.

Unsupervised paraphrase generation is a promising and important research topic in natural language processing. We propose UPSA, a novel approach that accomplishes Unsupervised Paraphrasing by Simulated Annealing. We model paraphrase generation as an optimization problem and propose a sophisticated objective function involving the semantic similarity, expression diversity, and language fluency of paraphrases. UPSA then searches the sentence space towards this objective by performing a sequence of local edits. Our method is unsupervised and does not require parallel corpora for training, so it can be easily applied to different domains. We evaluate our approach on a variety of benchmark datasets, namely Quora, Wikianswers, MSCOCO, and Twitter. Extensive results show that UPSA achieves state-of-the-art performance compared with previous unsupervised methods in terms of both automatic and human evaluations. Further, our approach outperforms most existing domain-adapted supervised models, showing the generalizability of UPSA.
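
To make the search procedure in the abstract concrete, here is a minimal sketch of simulated annealing over word-level edits. It is not the authors' implementation. The scoring function `toy_score`, the edit proposer `propose`, the linear cooling schedule, and all parameter values are illustrative assumptions; in particular, the three score terms are crude stand-ins for the semantic similarity, expression diversity, and language fluency components that the abstract names, which in practice would rely on learned models.

```python
import math
import random


def toy_score(candidate, source):
    """Toy objective (an assumption, not the paper's): product of crude
    stand-ins for similarity, diversity, and fluency."""
    src, cand = set(source), set(candidate)
    # Similarity stand-in: Jaccard overlap of the two word sets.
    similarity = len(src & cand) / len(src | cand)
    # Diversity stand-in: fraction of positions whose word differs.
    same = sum(a == b for a, b in zip(candidate, source))
    diversity = 1.0 - same / max(len(candidate), len(source))
    # Fluency stand-in: penalise immediately repeated words.
    repeats = sum(a == b for a, b in zip(candidate, candidate[1:]))
    fluency = 1.0 / (1.0 + repeats)
    return similarity * (0.5 + diversity) * fluency


def propose(words, vocab, rng):
    """Apply one random local edit: replace, insert, or delete a word."""
    words = list(words)
    op = rng.choice(["replace", "insert", "delete"])
    pos = rng.randrange(len(words))
    if op == "replace":
        words[pos] = rng.choice(vocab)
    elif op == "insert":
        words.insert(pos, rng.choice(vocab))
    elif len(words) > 1:  # never delete the last remaining word
        del words[pos]
    return words


def anneal(source, vocab, steps=300, t_init=0.1, seed=0):
    """Simulated annealing with a linearly decreasing temperature."""
    rng = random.Random(seed)
    current = list(source)
    current_score = toy_score(current, source)
    best, best_score = current, current_score
    for step in range(steps):
        temperature = max(t_init * (1 - step / steps), 1e-6)
        candidate = propose(current, vocab, rng)
        cand_score = toy_score(candidate, source)
        delta = cand_score - current_score
        # Always accept improvements; accept worse candidates with
        # probability exp(delta / T), which shrinks as T cools.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score


if __name__ == "__main__":
    source = "how can i learn python quickly".split()
    vocab = source + ["fast", "study", "what", "is", "the", "best", "way", "to"]
    best, score = anneal(source, vocab)
    print(" ".join(best), round(score, 3))
```

Early in the run the high temperature lets the search accept edits that lower the score, which helps it escape local optima; as the temperature falls, the search behaves more like greedy hill climbing. With a toy objective like this one the output will not be a fluent paraphrase; the sketch only illustrates the accept/reject loop.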
