CL, AI | May 2, 2025

Large Language Model-Driven Dynamic Assessment of Grammatical Accuracy in English Language Learner Writing

Timur Jaganov, John Blake, Julián Villegas, Nicholas Carr

arXiv:2505.00931v12.71 citations | h-index: 1 | IEEE Access

Originality: Synthesis-oriented

AI Analysis

The work enables dynamic assessment to reach larger groups in language learning classrooms, though the contribution is incremental: it applies existing LLMs to a specific educational domain.

The study tackled scaling Dynamic Assessment (DA) for English language learners by developing DynaWrite, a tutoring app using Large Language Models (LLMs) to provide feedback, finding that GPT-4o outperformed neural chat in generating clear and progressive hints while maintaining real-time responsiveness.

This study investigates the potential for Large Language Models (LLMs) to scale up Dynamic Assessment (DA). To facilitate this investigation, we first developed DynaWrite, a modular, microservices-based grammatical tutoring application that supports multiple LLMs to generate dynamic feedback for learners of English. Initial testing of 21 LLMs revealed GPT-4o and neural chat to have the most potential to scale up DA in the language learning classroom. Further testing of these two candidates found that both models performed similarly in their ability to accurately identify grammatical errors in user sentences. However, GPT-4o consistently outperformed neural chat in the quality of its DA by generating clear, consistent, and progressively explicit hints. Real-time responsiveness and system stability were also confirmed through detailed performance testing, with GPT-4o exhibiting sufficient speed and stability. This study shows that LLMs can be used to scale up dynamic assessment and thus enable it to be delivered to larger groups than is possible in traditional teacher-learner settings.
