Analyzing Examinee Comments using DistilBERT and Machine Learning to Ensure Quality Control in Exam Content
This approach offers testing organizations an incremental improvement to quality assurance by incorporating direct candidate experience more efficiently.
The study addresses the problem of identifying problematic test items by analyzing candidate comments with NLP and machine learning. It finds that candidate feedback provides valuable information complementary to statistical methods, which could improve test validity and reduce the manual review burden.
This study explores the use of Natural Language Processing (NLP) to analyze candidate comments in order to identify problematic test items. We developed and validated machine learning models that automatically identify relevant negative feedback, evaluated whether incorporating psychometric features enhances model performance, and compared NLP-flagged items with traditionally flagged items. Results demonstrate that candidate feedback provides valuable information complementary to statistical methods, potentially improving test validity while reducing the manual review burden. This research offers testing organizations an efficient mechanism for incorporating direct candidate experience into quality assurance processes.
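The comparison between NLP-flagged and traditionally flagged items can be illustrated with a small set-overlap sketch. The abstract does not say how the comparison was carried out, so the function name `compare_flags`, the item IDs, and the choice of Jaccard similarity are all assumptions made for illustration, not the study's actual procedure or data.

```python
def compare_flags(nlp_flagged, stat_flagged):
    """Summarize agreement between two sets of flagged item IDs.

    Hypothetical helper: the study does not specify its comparison method.
    """
    nlp, stat = set(nlp_flagged), set(stat_flagged)
    union = nlp | stat
    return {
        # Items flagged by both comment analysis and statistical screening.
        "both": sorted(nlp & stat),
        # Items surfaced only by candidate comments (the complementary signal).
        "nlp_only": sorted(nlp - stat),
        # Items surfaced only by statistical screening.
        "stat_only": sorted(stat - nlp),
        # Jaccard similarity: shared items divided by all flagged items.
        "jaccard": len(nlp & stat) / len(union) if union else 0.0,
    }


# Made-up item IDs, for illustration only.
result = compare_flags(
    ["item_03", "item_07", "item_12"],
    ["item_07", "item_12", "item_20", "item_31"],
)
print(result)
```

In a sketch like this, a non-empty `nlp_only` list is what "complementary information" would look like in practice: items that candidates complained about but that statistical screening did not flag.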