CL · Aug 4, 2023

Learning to Paraphrase Sentences to Different Complexity Levels

arXiv:2308.02226v1138 citations · h-index: 20 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for versatile sentence paraphrasing models in NLP, though it is incremental as it builds on existing simplification tasks.

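The paper addresses training models for sentence simplification, complexification, and same-level paraphrasing. It introduces two new unsupervised datasets, one labeled by a weak classifier and the other by a rule-based approach, and compares them with a supervised dataset. Among systems trained on unsupervised parallel data, models trained on the weak-classifier-labeled dataset achieve state-of-the-art performance on the ASSET simplification benchmark, and the models also outperform previous work on sentence level targeting.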

While sentence simplification is an active research topic in NLP, its adjacent tasks of sentence complexification and same-level paraphrasing are not. To train models on all three tasks, we present two new unsupervised datasets. We compare these datasets, one labeled by a weak classifier and the other by a rule-based approach, with a single supervised dataset. Using these three datasets for training, we perform extensive experiments on both multitasking and prompting strategies. Compared to other systems trained on unsupervised parallel data, models trained on our weak classifier labeled dataset achieve state-of-the-art performance on the ASSET simplification benchmark. Our models also outperform previous work on sentence level targeting. Finally, we establish how a handful of Large Language Models perform on these tasks under a zero-shot setting.

Code Implementations: 1 repo
