CL AI | Feb 21, 2024

DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing

Haneul Yoo, Jieun Han, So-Yeon Ahn, Alice Oh

arXiv:2402.16733v37.214 citations | h-index: 13 | ACL

Originality: Synthesis-oriented

AI Analysis

This work addresses a practical problem for EFL educators and students by enabling more accurate, rubric-based automated essay scoring. The contribution is incremental, however, because it builds on existing dataset standardization and augmentation methods.

The paper tackles the lack of appropriate datasets for automated essay scoring in English as a Foreign Language education by releasing DREsS, a large-scale dataset with 48.9K samples. It also proposes CASE, a corruption-based augmentation strategy that improves baseline results by 45.44%.

Automated essay scoring (AES) is a useful tool in English as a Foreign Language (EFL) writing education, offering real-time essay scores for students and instructors. However, previous AES models were trained on essays and scores irrelevant to the practical scenarios of EFL writing education and usually provided a single holistic score due to the lack of appropriate datasets. In this paper, we release DREsS, a large-scale, standard dataset for rubric-based automated essay scoring with 48.9K samples in total. DREsS comprises three sub-datasets: DREsS_New, DREsS_Std., and DREsS_CASE. We collect DREsS_New, a real-classroom dataset with 2.3K essays authored by EFL undergraduate students and scored by English education experts. We also standardize existing rubric-based essay scoring datasets as DREsS_Std. We suggest CASE, a corruption-based augmentation strategy for essays, which generates 40.1K synthetic samples of DREsS_CASE and improves the baseline results by 45.44%. DREsS will enable further research to provide a more accurate and practical AES system for EFL writing education.

