CL | Mar 17

Exploiting the English Grammar Profile for L2 grammatical analysis with LLMs

Stefano Bannò, Penny Karanasou, Kate Knill, Mark Gales

arXiv:2603.17171 | 11.2 | h-index: 11

AI Analysis

This work addresses the need for targeted feedback and assessment in second language learning, though it is incremental as it builds on existing taxonomies and methods.

The paper tackles the problem of evaluating grammatical competence in second language learners by proposing a framework that uses the English Grammar Profile to detect and classify grammatical constructs as successful or unsuccessful, enabling fine-grained feedback and proficiency assessment. The results show that LLMs outperform rule-based methods for nuanced constructs, while a hybrid approach yields the strongest performance for proficiency assessment, with an automated pipeline closely matching semi-automated systems.

Evaluating the grammatical competence of second language (L2) learners is essential both for providing targeted feedback and for assessing proficiency. To achieve this, we propose a novel framework leveraging the English Grammar Profile (EGP), a taxonomy of grammatical constructs mapped to the proficiency levels of the Common European Framework of Reference (CEFR), to detect learners' attempts at grammatical constructs and classify them as successful or unsuccessful. This detection can then be used to provide fine-grained feedback. Moreover, the automatically detected attempts are used as predictors of holistic CEFR proficiency. For the selection of grammatical constructs derived from the EGP, rule-based and LLM-based classifiers are compared. We show that LLMs outperform rule-based methods on semantically and pragmatically nuanced constructs, while rule-based approaches remain competitive for constructs that rely purely on morphological or syntactic features and do not require semantic interpretation. For proficiency assessment, we evaluate both rule-based and hybrid pipelines and show that a hybrid approach combining a rule-based pre-filter with an LLM consistently yields the strongest performance. Since our framework operates on pairs of original learner sentences and their corrected counterparts, we also evaluate a fully automated pipeline using automatic grammatical error correction. This pipeline closely approaches the performance of semi-automated systems based on manual corrections, particularly for the detection of successful attempts at grammatical constructs. Overall, our framework emphasises learners' successful attempts in addition to unsuccessful ones, enabling positive, formative feedback and providing actionable insights into grammatical development.
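To make the hybrid idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single illustrative construct (present perfect with a regular participle), a hypothetical regex as the rule-based pre-filter, and a stub callable standing in for the LLM. The success criterion used here (the construct found in the corrected sentence also appears unchanged in the original) is likewise an assumption made for illustration.

```python
import re
from typing import Callable, Optional

# Hypothetical, simplified pattern for one construct:
# present perfect with a regular "-ed" participle.
PRESENT_PERFECT = re.compile(r"\b(?:have|has)\s+\w+ed\b", re.IGNORECASE)


def rule_prefilter(corrected: str) -> Optional[str]:
    """Return the candidate construct span in the corrected sentence, if any."""
    match = PRESENT_PERFECT.search(corrected)
    return match.group(0) if match else None


def classify_attempt(
    original: str,
    corrected: str,
    llm_confirms: Callable[[str, str], bool],
) -> Optional[str]:
    """Label an attempt as 'successful' or 'unsuccessful', or None if no attempt.

    The rule-based pre-filter proposes a candidate; the (assumed) LLM check
    confirms it. An attempt counts as successful here if the candidate span
    already appears verbatim in the original learner sentence.
    """
    candidate = rule_prefilter(corrected)
    if candidate is None or not llm_confirms(corrected, candidate):
        return None
    if candidate.lower() in original.lower():
        return "successful"
    return "unsuccessful"


def stub_llm(sentence: str, candidate: str) -> bool:
    """Placeholder for an LLM call; always accepts the candidate."""
    return True


if __name__ == "__main__":
    pairs = [
        ("I have visit Rome twice.", "I have visited Rome twice."),
        ("She has finished her work.", "She has finished her work."),
        ("I go home.", "I go home."),
    ]
    for original, corrected in pairs:
        print(original, "->", classify_attempt(original, corrected, stub_llm))
```

In a real system the stub would be replaced by an actual model call, and the pre-filter would cover many EGP constructs rather than one regex. The design point the sketch illustrates is only that the cheap rule narrows the candidates before the more expensive check runs.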
