AI, CL | Feb 9

The Use of AI Tools to Develop and Validate Q-Matrices

Kevin Fan, Jacquelyn A. Bialo, Hongli Li

arXiv:2602.08796v1

h-index: 4

Originality: Incremental advance

AI Analysis

The paper addresses the labor-intensive process of Q-matrix construction in cognitive diagnostic modeling. The contribution appears incremental because it applies existing AI tools to a known bottleneck.

This study investigated whether AI tools could support Q-matrix development for cognitive diagnostic modeling by comparing AI-generated Q-matrices with a validated Q-matrix for a reading comprehension test. Results showed substantial variation across AI models, with Google Gemini 2.5 Pro achieving the highest agreement (Kappa = 0.63) with the validated Q-matrix, exceeding that of all human experts.

Constructing a Q-matrix is a critical but labor-intensive step in cognitive diagnostic modeling (CDM). This study investigates whether AI tools (i.e., general language models) can support Q-matrix development by comparing AI-generated Q-matrices with a validated Q-matrix from Li and Suen (2013) for a reading comprehension test. In May 2025, multiple AI models were provided with the same training materials as human experts. Agreement among AI-generated Q-matrices, the validated Q-matrix, and human raters' Q-matrices was assessed using Cohen's kappa. Results showed substantial variation across AI models, with Google Gemini 2.5 Pro achieving the highest agreement (Kappa = 0.63) with the validated Q-matrix, exceeding that of all human experts. A follow-up analysis in January 2026 using newer AI versions, however, revealed lower agreement with the validated Q-matrix. Implications and directions for future research are discussed.
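To make the agreement measure concrete, the following is a minimal sketch of Cohen's kappa computed between two binary Q-matrices. It assumes that each item-by-attribute cell is treated as one rating and that the matrices are flattened before comparison; the abstract does not say whether the authors computed kappa this way. The function name `cohens_kappa` and the two small matrices are hypothetical and are not data from the study.

```python
import math
from typing import Sequence


def cohens_kappa(q_a: Sequence[Sequence[int]], q_b: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa for two binary Q-matrices of the same shape.

    Assumption: every item-by-attribute cell counts as one independent rating.
    """
    # Flatten both matrices row by row into lists of 0/1 ratings.
    a = [int(x) for row in q_a for x in row]
    b = [int(x) for row in q_b for x in row]
    if not a or len(a) != len(b):
        raise ValueError("Q-matrices must be non-empty and have the same shape")

    n = len(a)
    # Observed agreement: share of cells where both matrices give the same entry.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each matrix's marginal proportion of 1s and 0s.
    p_a1 = sum(a) / n
    p_b1 = sum(b) / n
    p_e = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)

    if math.isclose(p_e, 1.0):
        # Both matrices are constant and identical; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical example: 4 items (rows) by 3 attributes (columns).
validated = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
candidate = [[1, 0, 0], [0, 1, 1], [1, 0, 0], [0, 0, 1]]

kappa = cohens_kappa(validated, candidate)
print(f"kappa = {kappa:.2f}")
```

In this toy example the two matrices agree on 10 of 12 cells, and the chance-corrected agreement is 46/70, or about 0.66. Because kappa corrects for the marginal proportion of 1s in each matrix, it can differ noticeably from raw percent agreement when Q-matrices are sparse.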
