Algebra Error Classification with Large Language Models
This work addresses the need for more flexible and generalizable error classification in educational technology. The contribution is incremental, since it builds on existing data-driven approaches.
The paper tackles the problem of classifying student errors in algebra for automated feedback systems. It introduces a method based on pre-trained large language models that can outperform existing methods and can classify a larger set of student responses.
Automated feedback given as students answer open-ended math questions has significant potential to improve learning outcomes at scale. A key part of an automated feedback system is its error classification component, which identifies student errors and enables appropriate, predefined feedback to be deployed. Most existing approaches to error classification are rule-based and have limited capacity to generalize. Existing data-driven methods avoid this limitation but require the mathematical expressions in student responses to be parsed into syntax trees. This requirement is itself a limitation, since student responses are not always syntactically valid, and invalid responses cannot be converted into trees. In this work, we introduce a flexible method for error classification using pre-trained large language models. We demonstrate that our method can outperform existing methods in algebra error classification and is able to classify a larger set of student responses. Additionally, we analyze common classification errors made by our method and discuss limitations of automated error classification.
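The syntax-tree limitation described in the abstract can be illustrated with a minimal sketch. It is not the method from the paper: Python's built-in `ast` parser stands in for a mathematical expression parser, and the helper names `to_syntax_tree` and `to_text_input`, along with the "Question: ... Response: ..." format, are hypothetical choices made only for illustration.

```python
import ast


def to_syntax_tree(expression: str):
    """Return a parse tree, or None if the expression is not syntactically valid.

    Assumption: Python's expression grammar stands in for a math parser.
    """
    try:
        return ast.parse(expression, mode="eval")
    except SyntaxError:
        return None


def to_text_input(question: str, response: str) -> str:
    """Serialize a question/response pair as plain text.

    Hypothetical input format for a text-based classifier; the raw response
    is passed through unchanged, so no parsing step can fail.
    """
    return f"Question: {question} Response: {response}"


if __name__ == "__main__":
    question = "Simplify 2*x + 3"
    # The second and third responses are syntactically invalid.
    for response in ["2*x + 3", "2*x + = 3", "(x + 1"]:
        has_tree = to_syntax_tree(response) is not None
        print(repr(response), "| tree available:", has_tree)
        print("  text input:", to_text_input(question, response))
```

A tree-based classifier has nothing to work with when `to_syntax_tree` returns `None`, whereas a text-based model can still receive the string from `to_text_input`. This is one plausible reading of why a language-model approach can cover a larger set of student responses.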