CL · Jun 13, 2024

Automated Essay Scoring Using Grammatical Variety and Errors with Multi-Task Learning and Item Response Theory

arXiv:2406.08817v1 · 27 citations
Originality: Incremental advance
AI Analysis

This work addresses automated essay scoring for educational assessment. It is an incremental advance: it builds on existing scoring methods and adds grammatical features to them.

This study tackles automatic essay scoring by adding grammatical features (correctly used grammatical items and grammatical error counts) to the model input. Multi-task learning and Item Response Theory-based weighting of these features further improve scoring performance.

This study examines the effect of grammatical features in automatic essay scoring (AES). We use two kinds of grammatical features as input to an AES model: (1) grammatical items that writers used correctly in essays, and (2) the number of grammatical errors. Experimental results show that grammatical features improve the performance of AES models that predict the holistic scores of essays. Multi-task learning with the holistic and grammar scores, combined with grammatical features, resulted in a larger improvement in model performance. We also show that a model using grammar abilities, estimated with Item Response Theory (IRT), as the labels for the auxiliary task achieved performance comparable to a model using grammar scores assigned by human raters. In addition, we weight the grammatical features using IRT to account for the difficulty of grammatical items and writers' grammar abilities. We found that weighting grammatical features by difficulty led to a further improvement in performance.
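The two ideas in the abstract, IRT-based feature weighting and multi-task learning, can be illustrated with a minimal sketch. The abstract does not state which IRT model, weighting function, or loss the authors use, so everything here is an assumption made for illustration: a one-parameter (Rasch) model, a softplus weight that grows with item difficulty, and a squared-error loss with a hypothetical `aux_weight` for the auxiliary grammar task. All function names are hypothetical.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability that a writer uses an item correctly under a 1PL (Rasch) model.

    Assumption: the abstract does not say which IRT model the paper uses.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def weight_by_difficulty(used_items, difficulties):
    """Scale binary 'used correctly' indicators so harder items count more.

    Assumption: softplus(difficulty) is an illustrative positive weight,
    not the weighting scheme from the paper.
    """
    return [u * math.log1p(math.exp(b)) for u, b in zip(used_items, difficulties)]


def multitask_loss(holistic_pred, holistic_gold, grammar_pred, grammar_gold,
                   aux_weight=0.5):
    """Squared error on the holistic score plus a weighted auxiliary grammar term.

    The grammar target could be a human-assigned grammar score or an
    IRT-estimated ability; aux_weight is a hypothetical hyperparameter.
    """
    main = (holistic_pred - holistic_gold) ** 2
    aux = (grammar_pred - grammar_gold) ** 2
    return main + aux_weight * aux


# Example: three grammatical items, the second one not used correctly.
features = weight_by_difficulty([1, 0, 1], [0.0, 2.0, 1.0])
loss = multitask_loss(3.0, 4.0, 1.0, 2.0)
```

In this sketch a correctly used difficult item contributes a larger feature value than a correctly used easy one, and the auxiliary term lets the grammar target influence training without replacing the holistic objective.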

