CY · AI · Sep 12, 2025

Assisting the Grading of a Handwritten General Chemistry Exam with Artificial Intelligence

arXiv:2509.10591v2 · 2 citations · h-index: 6
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of automating grading for chemistry educators. The contribution is incremental: it builds on existing AI grading methods with domain-specific adaptations.

The study tackles the grading of handwritten general chemistry exams with an AI-based system. It finds high agreement with human graders on textual and chemical reaction questions but lower reliability on numerical and graphical tasks, and the results point to promising applications in routine assessment.

We explore the effectiveness and reliability of an artificial intelligence (AI)-based grading system for a handwritten general chemistry exam, comparing AI-assigned scores to human grading across various types of questions. Exam pages and grading rubrics were uploaded as images to account for chemical reaction equations, short and long open-ended answers, numerical and symbolic answer derivations, drawing, and sketching in pencil-and-paper format. Using linear regression analyses and psychometric evaluations, the investigation reveals high agreement between AI and human graders for textual and chemical reaction questions, while highlighting lower reliability for numerical and graphical tasks. The findings emphasize the necessity for human oversight to ensure grading accuracy, based on selective filtering. The results indicate promising applications for AI in routine assessment tasks, though careful consideration must be given to student perceptions of fairness and trust in integrating AI-based grading into educational practice.
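The abstract mentions linear regression between AI-assigned and human-assigned scores. The following is a minimal sketch of that kind of comparison, not the authors' analysis: the function names `fit_line` and `within_tolerance`, the tolerance value, and the toy scores are all hypothetical and chosen only for illustration. A slope near 1, an intercept near 0, and a high R² would suggest that the AI scores track the human scores closely.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit y ~ slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary to fit a line")
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def within_tolerance(human, ai, tolerance):
    """Fraction of items where AI and human scores differ by at most `tolerance`."""
    hits = sum(1 for h, a in zip(human, ai) if abs(h - a) <= tolerance)
    return hits / len(human)


# Made-up scores for one question type (NOT data from the paper).
human_scores = [2, 4, 5, 7, 9, 10]
ai_scores = [2, 5, 5, 6, 9, 10]

# Regress AI scores on human scores and report agreement.
slope, intercept, r2 = fit_line(human_scores, ai_scores)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.2f}")
print(f"agreement within 0.5 points: {within_tolerance(human_scores, ai_scores, 0.5):.0%}")
```

Running the same fit separately per question type (textual, chemical reaction, numerical, graphical) is one way such a comparison could surface the differences in reliability the abstract describes.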

