CL · Nov 2, 2024

NLP and Education: using semantic similarity to evaluate filled gaps in a large-scale Cloze test in the classroom

Túlio Sousa de Gois, Flávia Oliveira Freitas, Julian Tejada, Raquel Meister Ko. Freitag

arXiv:2411.01280

Originality: Synthesis-oriented

AI Analysis

The study offers a more efficient method for evaluating reading proficiency in educational settings, though its contribution is incremental: it applies existing NLP techniques to a specific domain.

This study tackled the challenge of automating large-scale Cloze test corrections by using word embeddings to assess semantic similarity between expected and student answers in Brazilian Portuguese, finding that GloVe achieved the highest correlation with human judges.

This study examines the applicability of the Cloze test, a widely used tool for assessing text comprehension proficiency, while highlighting its challenges in large-scale implementation. To address these limitations, an automated correction approach was proposed, utilizing Natural Language Processing (NLP) techniques, particularly word embeddings (WE) models, to assess semantic similarity between expected and provided answers. Using data from Cloze tests administered to students in Brazil, WE models for Brazilian Portuguese (PT-BR) were employed to measure the semantic similarity of the responses. The results were validated through an experimental setup involving twelve judges who classified the students' answers. A comparative analysis between the WE models' scores and the judges' evaluations revealed that GloVe was the most effective model, demonstrating the highest correlation with the judges' assessments. This study underscores the utility of WE models in evaluating semantic similarity and their potential to enhance large-scale Cloze test assessments. Furthermore, it contributes to educational assessment methodologies by offering a more efficient approach to evaluating reading proficiency.
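To make the scoring idea in the abstract concrete, here is a minimal sketch of comparing an expected answer with a student's answer by cosine similarity between word vectors. It is not the authors' implementation. The three-dimensional vectors, the `TOY_EMBEDDINGS` table, and the `score_gap` helper are all hypothetical and exist only for illustration; a real setup would load pretrained PT-BR embeddings such as GloVe, and the abstract does not say which similarity measure or correlation statistic was used, so cosine similarity is an assumption here.

```python
import math

# Toy 3-dimensional vectors standing in for pretrained PT-BR word embeddings.
# The values are invented for illustration only.
TOY_EMBEDDINGS = {
    "casa": [0.9, 0.1, 0.0],   # "house"
    "lar": [0.8, 0.2, 0.1],    # "home"
    "carro": [0.0, 0.1, 0.9],  # "car"
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def score_gap(expected, answer, embeddings=TOY_EMBEDDINGS):
    """Score one filled gap (hypothetical helper).

    Returns 1.0 for an exact match, the cosine similarity when both
    words are in the vocabulary, and None when either word is missing.
    """
    if expected == answer:
        return 1.0
    if expected not in embeddings or answer not in embeddings:
        return None
    return cosine_similarity(embeddings[expected], embeddings[answer])


if __name__ == "__main__":
    for answer in ("casa", "lar", "carro", "xyz"):
        print(answer, score_gap("casa", answer))
```

In this toy vocabulary, a near-synonym ("lar") scores higher than an unrelated word ("carro"), which is the behavior a similarity-based grader relies on. Out-of-vocabulary answers return `None` rather than a score, so they can be routed to a human judge instead of being silently marked wrong.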
