CL | AI | Jan 12, 2024

Human-AI Collaborative Essay Scoring: A Dual-Process Framework with LLMs

arXiv:2401.06431v2 | 91 citations | h-index: 5 | Has Code | LAK
Originality: Incremental advance
AI Analysis

This work addresses the need for timely feedback in second-language learning by proposing a human-AI collaborative essay-scoring system. The advance is incremental: it builds on existing LLMs and grading methods.

This study explores large language models (LLMs) for automated essay scoring. It finds that LLMs do not outperform conventional state-of-the-art models, but they offer consistency, generalizability, and explainability. The proposed open-source system also improves the performance and efficiency of human graders, particularly on essays where the model has lower confidence.

Receiving timely and personalized feedback is essential for second-language learners, especially when human instructors are unavailable. This study explores the effectiveness of Large Language Models (LLMs), including both proprietary and open-source models, for Automated Essay Scoring (AES). Through extensive experiments with public and private datasets, we find that while LLMs do not surpass conventional state-of-the-art (SOTA) grading models in performance, they exhibit notable consistency, generalizability, and explainability. We propose an open-source LLM-based AES system, inspired by the dual-process theory. Our system offers accurate grading and high-quality feedback, at least comparable to that of fine-tuned proprietary LLMs, in addition to its ability to alleviate misgrading. Furthermore, we conduct human-AI co-grading experiments with both novice and expert graders. We find that our system not only automates the grading process but also enhances the performance and efficiency of human graders, particularly for essays where the model has lower confidence. These results highlight the potential of LLMs to facilitate effective human-AI collaboration in the educational context, potentially transforming learning experiences through AI-generated feedback.

Code Implementations: 1 repo
