Don't Change Me! User-Controllable Selective Paraphrase Generation
This work addresses the need for user-controllable paraphrase generation in applications where specific text must remain unchanged. Its contribution is incremental, since it builds on existing pretrained models.
The paper tackles the problem of generating paraphrases while preserving user-specified phrases that should not be altered. It introduces a method that lets users tag text segments as "don't change me!" and trains a model to copy these phrases explicitly. Experiments are reported in English and Chinese.
In the paraphrase generation task, source sentences often contain phrases that should not be altered. Which phrases these are, however, depends on context and can vary by application. Our solution to this challenge is to provide the user with explicit tags that can be placed around any arbitrary segment of text to mean "don't change me!" when generating a paraphrase; the model learns to explicitly copy these phrases to the output. The contribution of this work is a novel data generation technique using distant supervision, which allows us to start with a pretrained sequence-to-sequence model and fine-tune a paraphrase generator that exhibits this behavior, enabling user-controllable paraphrase generation. Additionally, we modify the loss during fine-tuning to explicitly encourage diversity in model output. Our technique is language agnostic, and we report experiments in English and Chinese.
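The tagging interface described above can be illustrated with a minimal sketch. It only shows how a user might mark protected segments and how one could check that a candidate paraphrase copies them verbatim; it does not reproduce the paper's model, data generation, or loss. The tag strings `<keep>` and `</keep>` and all function names are hypothetical, since the text above does not name the actual tag tokens, and the sketch assumes the protected phrases do not overlap.

```python
import re

# Hypothetical tag strings: the source does not specify the real tag tokens.
OPEN_TAG = "<keep>"
CLOSE_TAG = "</keep>"


def tag_spans(sentence, phrases):
    """Wrap every occurrence of each protected phrase in the tags.

    Assumes the phrases do not overlap one another.
    """
    tagged = sentence
    for phrase in phrases:
        tagged = tagged.replace(phrase, f"{OPEN_TAG}{phrase}{CLOSE_TAG}")
    return tagged


def protected_phrases(tagged):
    """Return the list of segments enclosed by the tags, in order."""
    pattern = re.escape(OPEN_TAG) + r"(.*?)" + re.escape(CLOSE_TAG)
    return re.findall(pattern, tagged)


def strip_tags(text):
    """Remove tag markers, leaving the plain sentence."""
    return text.replace(OPEN_TAG, "").replace(CLOSE_TAG, "")


def preserves_protected(tagged_source, paraphrase):
    """Check that every protected segment appears verbatim in the paraphrase."""
    output = strip_tags(paraphrase)
    return all(p in output for p in protected_phrases(tagged_source))


if __name__ == "__main__":
    source = tag_spans(
        "Please restart the Acme Router 3000 before calling support.",
        ["Acme Router 3000"],
    )
    print(source)
    # A paraphrase that keeps the protected phrase passes the check.
    print(preserves_protected(
        source, "Before you call support, reboot the Acme Router 3000."
    ))
```

A check like `preserves_protected` could serve as a simple post-hoc filter or evaluation metric for outputs of a tagged paraphrase generator; whether the paper evaluates copying this way is not stated in the text above.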