CY, LG · Sep 29, 2025

Personalized Auto-Grading and Feedback System for Constructive Geometry Tasks Using Large Language Models on an Online Math Platform

Yong Oh Lee, Byeonghun Bang, Joohyun Lee, Sejun Oh

arXiv:2509.25529v1 · 1 citation · h-index: 1 · IEEE Access

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for scalable, teacher-aligned formative assessment in mathematics education. Its contribution is incremental, however, since it applies existing LLMs to a specific domain.

The study addresses the problem of assessing complex student responses in geometry by developing a personalized auto-grading and feedback system built on GPT-4. In a pilot with 79 middle-school students, its grades closely aligned with teacher judgments and its feedback helped many students revise errors.

As personalized learning gains increasing attention in mathematics education, there is a growing demand for intelligent systems that can assess complex student responses and provide individualized feedback in real time. In this study, we present a personalized auto-grading and feedback system for constructive geometry tasks, developed using large language models (LLMs) and deployed on the Algeomath platform, a Korean online tool designed for interactive geometric constructions. The proposed system evaluates student-submitted geometric constructions by analyzing their procedural accuracy and conceptual understanding. It employs a prompt-based grading mechanism using GPT-4, where student answers and model solutions are compared through a few-shot learning approach. Feedback is generated based on teacher-authored examples built from anticipated student responses, and it dynamically adapts to the student's problem-solving history, allowing up to four iterative attempts per question. The system was piloted with 79 middle-school students, where LLM-generated grades and feedback were benchmarked against teacher judgments. Grading closely aligned with teachers, and feedback helped many students revise errors and complete multi-step geometry tasks. While short-term corrections were frequent, longer-term transfer effects were less clear. Overall, the study highlights the potential of LLMs to support scalable, teacher-aligned formative assessment in mathematics, while pointing to improvements needed in terminology handling and feedback design.
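The grading loop described in the abstract (few-shot comparison against a model solution, feedback conditioned on the student's history, at most four attempts) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt wording, the `Example` type, the `"correct"` grade label, and the `grade_fn` callable (a stand-in for whatever call is made to GPT-4) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

MAX_ATTEMPTS = 4  # the abstract states up to four attempts per question


@dataclass
class Example:
    """A teacher-authored example (hypothetical structure)."""
    student_answer: str
    grade: str
    feedback: str


def build_prompt(model_solution: str,
                 examples: List[Example],
                 history: List[Tuple[str, str]],
                 student_answer: str) -> str:
    """Assemble a few-shot grading prompt; the wording is an assumption."""
    parts = ["Grade the student's geometric construction.",
             f"Model solution: {model_solution}"]
    for i, ex in enumerate(examples, 1):
        parts.append(f"Example {i}: {ex.student_answer}\n"
                     f"Grade: {ex.grade}\nFeedback: {ex.feedback}")
    if history:
        # Earlier attempts let the feedback adapt to this student.
        parts.append("Previous attempts by this student:")
        for n, (answer, feedback) in enumerate(history, 1):
            parts.append(f"Attempt {n}: {answer}\nFeedback given: {feedback}")
    parts.append(f"Student answer: {student_answer}\nGrade:")
    return "\n".join(parts)


def run_session(model_solution: str,
                examples: List[Example],
                answers: List[str],
                grade_fn: Callable[[str], Tuple[str, str]]
                ) -> List[Tuple[str, str]]:
    """Grade successive answers until one is correct or attempts run out.

    grade_fn is a hypothetical wrapper around the LLM call; it takes the
    prompt and returns a (grade, feedback) pair.
    """
    history: List[Tuple[str, str]] = []
    results: List[Tuple[str, str]] = []
    for answer in answers[:MAX_ATTEMPTS]:
        prompt = build_prompt(model_solution, examples, history, answer)
        grade, feedback = grade_fn(prompt)
        results.append((grade, feedback))
        if grade == "correct":
            break
        history.append((answer, feedback))
    return results
```

Passing the grader in as a callable keeps the sketch independent of any particular LLM client, so the loop can be exercised with a stub before a real model is attached.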
