CL CY LG Jun 19, 2025

Operationalizing Automated Essay Scoring: A Human-Aware Approach

arXiv:2506.21603v2 4 citations h-index: 3

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for more reliable and trustworthy Automated Essay Scoring (AES) methods in educational and assessment contexts. Its contribution is incremental: it compares existing methods rather than introducing a new paradigm.

This paper tackles the problem of operationalizing Automated Essay Scoring (AES) systems by comparing machine learning-based approaches with Large Language Models (LLMs) along dimensions such as bias, robustness, and explainability. It finds that ML-based models outperform LLMs in accuracy but struggle with explainability, while both approaches have problems with bias and with robustness to edge scores.

This paper explores the human-centric operationalization of Automated Essay Scoring (AES) systems, addressing aspects beyond accuracy. We compare various machine learning-based approaches with Large Language Model (LLM) approaches, identifying their strengths, similarities, and differences. The study investigates key dimensions such as bias, robustness, and explainability, which are considered important for human-aware operationalization of AES systems. Our study shows that ML-based AES models outperform LLMs in accuracy but struggle with explainability, whereas LLMs provide richer explanations. We also find that both approaches struggle with bias and with robustness to edge scores. By analyzing these dimensions, the paper aims to identify challenges and trade-offs between different methods, contributing to more reliable and trustworthy AES methods.
