CL · Aug 19, 2019

Polly Want a Cracker: Analyzing Performance of Parroting on Paraphrase Generation Datasets

arXiv:1908.07831v11000 citations

Originality: Synthesis-oriented

AI Analysis

The paper reveals a critical flaw in how paraphrase generation is currently evaluated, which matters to NLP researchers working on the task. Its contribution is incremental: it exposes dataset and evaluation issues rather than advancing methods.

The paper analyzed paraphrase generation datasets and found that simply repeating input sentences (parroting) outperformed state-of-the-art models on standard metrics, showing models can appear adept while making trivial or no changes.

Paraphrase generation is an interesting and challenging NLP task which has numerous practical applications. In this paper, we analyze datasets commonly used for paraphrase generation research, and show that simply parroting input sentences surpasses state-of-the-art models in the literature when evaluated on standard metrics. Our findings illustrate that a model could be seemingly adept at generating paraphrases, despite only making trivial changes to the input sentence or even none at all.
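To make the parroting effect concrete, here is a minimal sketch, not the authors' evaluation code. It assumes a simplified overlap metric (clipped n-gram precision, the core ingredient of BLEU, without the brevity penalty) and uses a made-up sentence pair; the function names `clipped_precision` and `parrot`, and all three example sentences, are illustrative assumptions rather than material from the paper or its datasets.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(candidate, reference, n=1):
    """Clipped n-gram precision of candidate against a single reference.

    Each candidate n-gram is credited at most as many times as it
    occurs in the reference, as in BLEU's modified precision.
    """
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total

def parrot(source):
    """The parroting baseline: return the input unchanged."""
    return source

# Toy example (invented for illustration, not taken from any dataset).
SOURCE = "how can i learn to cook quickly"
REFERENCE = "how can i quickly learn to cook"
# Hypothetical model output that genuinely rephrases the input.
MODEL_OUTPUT = "what is the fastest way to become a chef"

parrot_score = clipped_precision(parrot(SOURCE), REFERENCE, n=1)
model_score = clipped_precision(MODEL_OUTPUT, REFERENCE, n=1)

print("parrot unigram precision:", parrot_score)
print("model unigram precision:", model_score)
```

In this toy case the unchanged input shares every word with the reference, so it scores higher on the overlap metric than an output that actually rephrases the sentence. Real evaluations use full metrics such as BLEU over whole test sets, so this only illustrates the mechanism, not the paper's reported numbers.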
