CL IR
Nov 28, 2016

Joint Copying and Restricted Generation for Paraphrase

Ziqiang Cao, Chuwei Luo, Wenjie Li, Sujian Li

arXiv:1611.09235v112.790 citations

Originality: Incremental advance

AI Analysis

This paper addresses the need for improved paraphrase generation in NLP applications. It offers an incremental advance by explicitly modeling the copying and rewriting modes.

The paper tackles paraphrase generation in tasks such as summarization and simplification by proposing a Seq2Seq model that fuses a copying decoder with a restricted generative decoder. The authors report that it outperforms state-of-the-art methods in informativeness and language quality on two datasets.

Many natural language generation tasks, such as abstractive summarization and text simplification, are paraphrase-orientated. In these tasks, copying and rewriting are two main writing modes. Most previous sequence-to-sequence (Seq2Seq) models use a single decoder and neglect this fact. In this paper, we develop a novel Seq2Seq model to fuse a copying decoder and a restricted generative decoder. The copying decoder finds the position to be copied based on a typical attention model. The generative decoder produces words limited in the source-specific vocabulary. To combine the two decoders and determine the final output, we develop a predictor to predict the mode of copying or rewriting. This predictor can be guided by the actual writing mode in the training data. We conduct extensive experiments on two different paraphrase datasets. The result shows that our model outperforms the state-of-the-art approaches in terms of both informativeness and language quality.
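
To make the fusion idea in the abstract concrete, here is a minimal sketch of a single decoding step. It is an illustration under stated assumptions, not the authors' implementation: it assumes the predictor outputs a scalar copy probability and that the two decoders are combined as a linear mixture. The function name `fuse_step`, its parameters, and the toy scores are all hypothetical.

```python
import math
from collections import defaultdict


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_step(source_tokens, attention_scores, gen_vocab, gen_scores, copy_prob):
    """Combine a copying distribution and a restricted generative distribution.

    Assumption: the final distribution is a linear mixture weighted by
    copy_prob, the (hypothetical) output of the mode predictor.
    """
    attn = softmax(attention_scores)  # copying decoder: one weight per source position
    gen = softmax(gen_scores)         # generative decoder: restricted vocabulary only
    fused = defaultdict(float)
    # A source word that occurs at several positions accumulates their attention mass.
    for token, a in zip(source_tokens, attn):
        fused[token] += copy_prob * a
    for word, g in zip(gen_vocab, gen):
        fused[word] += (1.0 - copy_prob) * g
    return dict(fused)


# Toy inputs; every score below is made up for illustration.
source = ["the", "cat", "sat", "on", "the", "mat"]
attention_scores = [0.1, 2.0, 0.3, 0.0, 0.1, 0.5]
gen_vocab = ["a", "feline", "rested", "rug"]  # stand-in for a source-specific vocabulary
gen_scores = [0.2, 1.5, 0.4, 0.1]

dist = fuse_step(source, attention_scores, gen_vocab, gen_scores, copy_prob=0.7)
best = max(dist, key=dist.get)
```

With a high copy probability the step favors the most-attended source word. Lowering `copy_prob` shifts probability mass toward the restricted generative vocabulary, which corresponds to the rewriting mode.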
