The Course Difficulty Analysis Cookbook
This work addresses the need for fair and reliable course difficulty assessments in educational programs, but it is incremental as it reviews and synthesizes existing methods rather than introducing new ones.
The paper tackles the problem of measuring course difficulty in curriculum analytics by providing a comprehensive review and comparison of existing approaches based on grade point averages and latent trait modeling, and it offers a tutorial and software for practical applications like monitoring difficulty over time and detecting disparate outcomes.
Curriculum analytics (CA) studies curriculum structure and student data to ensure the quality of educational programs. An essential aspect is studying course properties, which involves assigning each course a representative difficulty value. This is critical for several aspects of CA, such as quality control (e.g., monitoring variations over time), course comparisons (e.g., articulation), and course recommendation (e.g., advising). Measuring course difficulty requires careful consideration of multiple factors: First, when difficulty measures are sensitive to the performance level of enrolled students, it can bias interpretations by overlooking student diversity. By assessing difficulty independently of enrolled students' performances, we can reduce the risk of bias and enable fair, representative assessments of difficulty. Second, from a measurement theoretic perspective, the measurement must be reliable and valid to provide a robust basis for subsequent analyses. Third, difficulty measures should account for covariates, such as the characteristics of individual students within a diverse populations (e.g., transfer status). In recent years, various notions of difficulty have been proposed. This paper provides the first comprehensive review and comparison of existing approaches for assessing course difficulty based on grade point averages and latent trait modeling. It further offers a hands-on tutorial on model selection, assumption checking, and practical CA applications. These applications include monitoring course difficulty over time and detecting courses with disparate outcomes between distinct groups of students (e.g., dropouts vs. graduates), ultimately aiming to promote high-quality, fair, and equitable learning experiences. To support further research and application, we provide an open-source software package and artificial datasets, facilitating reproducibility and adoption.