ParaRev: Building a dataset for Scientific Paragraph Revision annotated with revision instruction
This work addresses the need for better automated writing assistance in scientific contexts by providing a dataset and method for paragraph-level revision, though it is incremental, building on existing sentence-level approaches.
The authors tackle the problem of automated scientific writing revision by shifting from sentence-level to paragraph-level scope and by introducing the ParaRev dataset, whose evaluation subset is annotated with detailed revision instructions. They demonstrate that detailed instructions significantly improve revision quality across models and metrics.
Revision is a crucial step in scientific writing, where authors refine their work to improve clarity, structure, and academic quality. Existing approaches to automated writing assistance often focus on sentence-level revisions, which fail to capture the broader context needed for effective modification. In this paper, we explore the impact of shifting from sentence-level to paragraph-level scope for the task of scientific text revision. Defining the task at the paragraph level allows for more meaningful changes, guided by detailed revision instructions rather than general ones. To support this task, we introduce ParaRev, the first dataset of revised scientific paragraphs with an evaluation subset manually annotated with revision instructions. Our experiments demonstrate that using detailed instructions significantly improves the quality of automated revisions compared to general instructions, regardless of the model or metric considered.