Paraphrasing via Ranking Many Candidates
This incremental approach improves paraphrase generation for NLP tasks and supports data augmentation that improves downstream performance.
The paper addresses the difficulty of generating high-quality paraphrases by selecting the best candidate from many generated options rather than relying on a single generation method. It demonstrates the approach across several domains and in both English and Korean, with performance comparable to previous methods.
We present a simple and effective way to generate a variety of paraphrases and to find a good-quality paraphrase among them. As observed in previous studies, it is difficult to ensure that a single generation method always produces the best paraphrase across domains. We therefore focus on finding the best candidate among many, rather than assuming that there is a single best combination of generative model and decoding options. Our approach is easy to apply in various domains and performs sufficiently well compared with previous methods. It can also be used for data augmentation that extends a downstream corpus, and we show that this helps improve performance on English and Korean datasets.
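The generate-then-rank idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the generators or the ranking criterion, so `rank_paraphrases`, `token_overlap`, and `toy_scorer` are hypothetical names, the generators are plain callables standing in for different model and decoding combinations, and the scorer is a crude token-overlap heuristic rather than the paper's actual ranking method.

```python
from typing import Callable, List, Tuple

# A generator stands in for one (model, decoding option) combination.
Generator = Callable[[str], List[str]]
# A scorer maps (source, candidate) to a number; higher means better.
Scorer = Callable[[str, str], float]


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercased whitespace tokens (a crude similarity proxy)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def toy_scorer(source: str, candidate: str) -> float:
    """Hypothetical stand-in scorer, NOT the paper's criterion.

    It favors candidates that share some, but not all, tokens with the
    source: the product peaks when the overlap is 0.5.
    """
    sim = token_overlap(source, candidate)
    return sim * (1.0 - sim)


def rank_paraphrases(
    source: str,
    generators: List[Generator],
    scorer: Scorer = toy_scorer,
) -> List[Tuple[float, str]]:
    """Pool candidates from all generators, deduplicate, and sort by score."""
    source_key = source.strip().lower()
    seen = set()
    pool: List[str] = []
    for generate in generators:
        for candidate in generate(source):
            key = candidate.strip().lower()
            # Skip empty strings, verbatim copies of the source, and repeats.
            if key and key != source_key and key not in seen:
                seen.add(key)
                pool.append(candidate.strip())
    scored = [(scorer(source, candidate), candidate) for candidate in pool]
    # sorted() is stable, so ties keep their generation order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Fixed outputs stand in for real model calls.
    def gen_a(_: str) -> List[str]:
        return ["the movie was really good", "the film was really good"]

    def gen_b(_: str) -> List[str]:
        return ["I enjoyed the film a lot", "the movie was great"]

    for score, text in rank_paraphrases("the movie was really good", [gen_a, gen_b]):
        print(f"{score:.3f}  {text}")
```

Keeping the scorer as a parameter reflects the point of the abstract: no single generator is assumed to be best, so any number of generators can feed the pool, and the selection step can be swapped without touching generation. For data augmentation, the top-ranked candidate for each training sentence could be appended to the corpus with the original label, again as an assumption about usage rather than a description of the paper's exact procedure.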