LG, AI, Jul 3, 2025

Automated Grading of Students' Handwritten Graphs: A Comparison of Meta-Learning and Vision-Large Language Models

arXiv:2507.03056v1, 1 citation, h-index: 17
Originality: Incremental advance
AI Analysis

The paper addresses the problem of efficient and consistent assessment in online mathematics education for STEM students. The advance is incremental: it extends existing autograding research to students' handwritten graphs.

The study tackles autograding of students' handwritten graphs in STEM education by comparing meta-learning models with Vision Large Language Models (VLLMs). The best meta-learning models outperformed VLLMs in 2-way classification tasks, while the best VLLMs slightly outperformed the meta-learning models in 3-way classification tasks.

With the rise of online learning, the demand for efficient and consistent assessment in mathematics has significantly increased over the past decade. Machine Learning (ML), particularly Natural Language Processing (NLP), has been widely used for autograding student responses, especially those involving text and/or mathematical expressions. However, there has been limited research on autograding responses involving students' handwritten graphs, despite their prevalence in Science, Technology, Engineering, and Mathematics (STEM) curricula. In this study, we implement multimodal meta-learning models for autograding images containing students' handwritten graphs and text. We further compare the performance of Vision Large Language Models (VLLMs) with these specially trained meta-learning models. Our results, evaluated on a real-world dataset collected from our institution, show that the best-performing meta-learning models outperform VLLMs in 2-way classification tasks. In contrast, in more complex 3-way classification tasks, the best-performing VLLMs slightly outperform the meta-learning models. While VLLMs show promising results, their reliability and practical applicability remain uncertain and require further investigation.
