A Quality-based Syntactic Template Retriever for Syntactically-controlled Paraphrase Generation
This work addresses a bottleneck in the practical use of syntactically-controlled paraphrase generation (SPG) for natural language processing tasks, though the contribution is incremental because it builds on existing SPG models.
The paper tackles the difficulty of obtaining reliable syntactic templates for syntactically-controlled paraphrase generation by proposing a Quality-based Syntactic Template Retriever (QSTR) and a Diverse Templates Search (DTS) algorithm. The approach yields significant improvements over existing retrieval methods and performs comparably to human-annotated templates on reference-free metrics.
Existing syntactically-controlled paraphrase generation (SPG) models perform well with human-annotated or well-chosen syntactic templates. However, the difficulty of obtaining such templates hinders the practical application of SPG models. For one thing, the prohibitive cost makes it infeasible to manually design decent templates for every source sentence. For another, the templates automatically retrieved by current heuristic methods are usually too unreliable for SPG models to generate high-quality paraphrases. To escape this dilemma, we propose a novel Quality-based Syntactic Template Retriever (QSTR) to retrieve templates based on the quality of the to-be-generated paraphrases. Furthermore, for situations requiring multiple paraphrases for each source sentence, we design a Diverse Templates Search (DTS) algorithm, which can enhance the diversity between paraphrases without sacrificing quality. Experiments demonstrate that QSTR can significantly surpass existing retrieval methods in generating high-quality paraphrases and even performs comparably with human-annotated templates in terms of reference-free metrics. Additionally, human evaluation and the performance on downstream tasks using our generated paraphrases for data augmentation demonstrate the potential of QSTR and the DTS algorithm in practical scenarios.
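The abstract does not specify how quality is scored or how diversity is enforced, so the following is only an illustrative sketch of the general idea of choosing several templates that are both high-quality and mutually different. It assumes each candidate template already has a predicted quality score (in the paper this role would be played by QSTR), and it swaps in a simple greedy max-min selection with a token-level Jaccard distance. The function names, the distance measure, the threshold `min_quality`, and the toy templates are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence


def token_jaccard_distance(a: str, b: str) -> float:
    """Hypothetical template distance: 1 minus the Jaccard overlap of tokens."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def select_diverse_templates(
    templates: Sequence[str],
    quality: Sequence[float],
    distance: Callable[[str, str], float],
    k: int,
    min_quality: float,
) -> List[str]:
    """Greedily pick up to k templates that clear a quality threshold
    and are far apart from each other (an assumed stand-in for DTS)."""
    # Keep only candidates whose predicted quality clears the threshold.
    pool = [(t, q) for t, q in zip(templates, quality) if q >= min_quality]
    if not pool or k <= 0:
        return []

    # Seed the selection with the highest-quality candidate.
    pool.sort(key=lambda tq: tq[1], reverse=True)
    chosen = [pool[0][0]]
    remaining = [t for t, _ in pool[1:]]

    # Repeatedly add the candidate whose nearest chosen template is farthest away.
    while remaining and len(chosen) < k:
        best = max(remaining, key=lambda t: min(distance(t, c) for c in chosen))
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy example with made-up templates and made-up quality scores.
candidates = ["S NP VP", "S NP VP .", "SBAR S VP", "FRAG NP"]
scores = [0.9, 0.85, 0.8, 0.3]
picked = select_diverse_templates(
    candidates, scores, token_jaccard_distance, k=2, min_quality=0.5
)
print(picked)
```

In this toy run the low-scoring candidate is filtered out first, so diversity is only sought among templates that already meet the quality bar. That ordering is one simple way to avoid trading quality for diversity, and it may differ from how the DTS algorithm actually balances the two.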