Improve LLM-based Automatic Essay Scoring with Linguistic Features
This work aims to improve the accuracy and efficiency of automatic essay scoring for educators, particularly for essays from diverse prompts, by enhancing LLM-based methods.
This paper addresses the challenge of Automatic Essay Scoring (AES) across diverse prompts by combining linguistic features with LLM-based scoring. The hybrid method demonstrated superior performance compared to baseline models for both in-domain and out-of-domain writing prompts.
Automatic Essay Scoring (AES) assigns scores to student essays, reducing the grading workload for instructors. Developing a scoring system that handles essays across diverse prompts is challenging because writing tasks are flexible and varied. Existing methods typically fall into two categories: supervised feature-based approaches and large language model (LLM)-based methods. Supervised feature-based approaches often achieve higher performance but require resource-intensive training. In contrast, LLM-based methods are computationally efficient during inference but tend to perform worse. This paper combines the two approaches by incorporating linguistic features into LLM-based scoring. Experimental results show that this hybrid method outperforms baseline models for both in-domain and out-of-domain writing prompts.
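As a rough illustration of what incorporating linguistic features into LLM-based scoring could look like, the sketch below computes a few surface features and places them in a scoring prompt. The feature set, the function names `linguistic_features` and `build_scoring_prompt`, the prompt wording, and the 1 to 6 score range are all assumptions made for this example, not the features or prompt used in the paper. The call to an actual LLM is deliberately left out.

```python
import re


def linguistic_features(essay: str) -> dict:
    """Compute a few simple surface features (illustrative choices only)."""
    # Naive sentence split on terminal punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    # Lowercased alphabetic tokens (apostrophes kept for contractions).
    words = re.findall(r"[A-Za-z']+", essay.lower())
    n_words = len(words)
    n_sents = len(sentences)
    return {
        "word_count": n_words,
        "sentence_count": n_sents,
        "avg_sentence_length": round(n_words / n_sents, 2) if n_sents else 0.0,
        "type_token_ratio": round(len(set(words)) / n_words, 2) if n_words else 0.0,
    }


def build_scoring_prompt(essay_prompt: str, essay: str, features: dict,
                         score_range: tuple = (1, 6)) -> str:
    """Assemble a hypothetical scoring prompt that lists the features as text."""
    low, high = score_range
    feature_lines = "\n".join(f"- {name}: {value}" for name, value in features.items())
    return (
        f"You are grading a student essay on a scale from {low} to {high}.\n"
        f"Writing prompt: {essay_prompt}\n\n"
        f"Essay:\n{essay}\n\n"
        f"Linguistic features of the essay:\n{feature_lines}\n\n"
        f"Reply with a single integer score between {low} and {high}."
    )


if __name__ == "__main__":
    essay = "Dogs are loyal. Dogs are friendly!"
    feats = linguistic_features(essay)
    print(build_scoring_prompt("Describe your favorite animal.", essay, feats))
```

The resulting string would be sent to whichever LLM is used for scoring. Because the features are passed as plain text, the same prompt template can be reused for unseen writing prompts without retraining, which is the setting where the abstract reports gains.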