CL, LG | Sep 30, 2019

Towards Diverse Paraphrase Generation Using Multi-Class Wasserstein GAN

arXiv:1909.13827v1 | 0.52 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of producing varied and high-quality paraphrases for natural language processing applications, representing an incremental improvement over existing methods.

The paper tackles diverse paraphrase generation by proposing a multi-class Wasserstein GAN whose transcoder is conditioned on a pattern embedding. It reports results that outperform the prior state of the art, with improved fluency and diversity, as validated by automatic metrics and human evaluation.

Paraphrase generation is an important and challenging natural language processing (NLP) task. In this work, we propose a deep generative model that generates diverse paraphrases. Our model is based on an encoder-decoder architecture. An additional transcoder converts a sentence into its paraphrasing latent code. The transcoder takes an explicit pattern embedding variable as a condition, so diverse paraphrases can be generated by sampling the pattern embedding variable. We use a Wasserstein GAN to align the distributions of the real and generated paraphrase samples. We propose a multi-class extension to the Wasserstein GAN, which allows our generative model to learn from both positive and negative samples. The generated paraphrase distribution is forced closer to the positive real distribution and pushed away from the negative distribution in Wasserstein distance. We test our model on two datasets with both automatic metrics and human evaluation. Results show that our model can generate fluent and reliable paraphrase samples that outperform the state-of-the-art results, while also providing reasonable variability and diversity.
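
The pull-toward-positive, push-away-from-negative idea in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's method: it assumes one-dimensional samples, where the 1-Wasserstein distance between two equal-size empirical samples has a closed form (mean absolute difference of the sorted values), whereas the paper trains a critic network on sentence representations. The function names, the `neg_weight` parameter, and the form of the combined objective are all hypothetical choices made for illustration.

```python
import random


def wasserstein_1d(a, b):
    """Exact 1-Wasserstein distance between two equal-size 1-D samples."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    xs, ys = sorted(a), sorted(b)
    # In 1-D the optimal transport plan matches sorted values pairwise.
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


def multiclass_objective(generated, positive, negative, neg_weight=1.0):
    """Toy generator objective (lower is better).

    Assumed form: distance to the positive samples minus a weighted
    distance to the negative samples. `neg_weight` is hypothetical.
    """
    return (wasserstein_1d(generated, positive)
            - neg_weight * wasserstein_1d(generated, negative))


def gaussian_sample(rng, mean, std, n):
    """Draw n values from a normal distribution using the given RNG."""
    return [rng.gauss(mean, std) for _ in range(n)]


rng = random.Random(0)
positive = gaussian_sample(rng, 2.0, 0.5, 1000)        # stand-in for real paraphrases
negative = gaussian_sample(rng, -2.0, 0.5, 1000)       # stand-in for non-paraphrases
near_positive = gaussian_sample(rng, 1.8, 0.5, 1000)   # a "good" generator
near_negative = gaussian_sample(rng, -1.8, 0.5, 1000)  # a "bad" generator

good = multiclass_objective(near_positive, positive, negative)
bad = multiclass_objective(near_negative, positive, negative)
print(f"objective near positive: {good:.3f}")
print(f"objective near negative: {bad:.3f}")
```

A generator whose samples sit near the positive distribution scores lower under this objective than one whose samples sit near the negative distribution, which is the direction of pressure the abstract describes.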
