ParaRev: Building a dataset for Scientific Paragraph Revision annotated with revision instruction

Bibliographic Details
Title: ParaRev: Building a dataset for Scientific Paragraph Revision annotated with revision instruction
Authors: Jourdan, Léane; Hernandez, Nicolas; Dufour, Richard; Boudin, Florian; Aizawa, Akiko
Publication Year: 2025
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language
More Details: Revision is a crucial step in scientific writing, where authors refine their work to improve clarity, structure, and academic quality. Existing approaches to automated writing assistance often focus on sentence-level revisions, which fail to capture the broader context needed for effective modification. In this paper, we explore the impact of shifting from sentence-level to paragraph-level scope for the task of scientific text revision. The paragraph-level definition of the task allows for more meaningful changes, and is guided by detailed revision instructions rather than general ones. To support this task, we introduce ParaRev, the first dataset of revised scientific paragraphs with an evaluation subset manually annotated with revision instructions. Our experiments demonstrate that using detailed instructions significantly improves the quality of automated revisions compared to general approaches, regardless of the model or metric considered.
Comment: Accepted at the WRAICogs 1 workshop (co-located with COLING 2025)
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2501.05222
Accession Number: edsarx.2501.05222
Database: arXiv