Do we need to go Deep? Knowledge Tracing with Big Data
This work addresses the need for effective knowledge tracing in education, showing that simpler models can be more accurate than deep learning on current datasets. The contribution is incremental in that it challenges the trend towards deep models.
The study tackled the problem of predicting student performance in interactive educational systems by comparing deep learning models with traditional logistic regression on the large-scale EdNet dataset, finding that logistic regression with engineered features outperformed deep models.
Interactive Educational Systems (IES) enable researchers to trace student knowledge across different skills and to recommend better learning paths. To estimate student knowledge and predict future performance, there is rapidly growing interest in using the student interaction data captured by IES to develop learner performance models. Moreover, with advances in computing systems, the amount of data captured by IES is also increasing, which enables deep learning models to compete with traditional logistic models and Markov processes. However, it is still not empirically evident whether these deep models outperform traditional models at the current scale of datasets, with millions of student interactions. In this work, we adopt EdNet, the largest publicly available student interaction dataset in the education domain, to understand how accurately both deep and traditional models predict future student performance. Through extensive experimentation, we observe that logistic regression models with carefully engineered features outperform deep models. We follow this analysis with interpretation studies based on Local Interpretable Model-agnostic Explanations (LIME) to understand the impact of various features on the predictions of the best-performing model.
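To make the idea of logistic regression with engineered features concrete, the following is a minimal sketch rather than the models evaluated in this work. It assumes two illustrative features per attempt, the log-scaled counts of a student's prior correct and incorrect answers on the same skill, which are not necessarily the features engineered for EdNet. The interaction log `toy_log` and the function names are hypothetical.

```python
import math

def build_features(interactions):
    """Convert a chronological list of (student, skill, correct) into rows.

    Each row uses only information available before the attempt:
    a bias term, log(1 + prior successes), and log(1 + prior failures)
    for that student on that skill. These features are an assumption.
    """
    wins, fails = {}, {}
    rows, labels = [], []
    for student, skill, correct in interactions:
        key = (student, skill)
        w, f = wins.get(key, 0), fails.get(key, 0)
        rows.append([1.0, math.log1p(w), math.log1p(f)])
        labels.append(correct)
        # Update the counters only after the row has been emitted.
        if correct:
            wins[key] = w + 1
        else:
            fails[key] = f + 1
    return rows, labels

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    """Predicted probability that the attempt described by x is correct."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))

def fit_logistic(rows, labels, lr=0.1, epochs=200):
    """Fit the weights with plain batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            for j, xi in enumerate(x):
                grad[j] += err * xi
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

# Hypothetical toy log: (student id, skill, 1 if answered correctly else 0).
toy_log = [
    ("s1", "fractions", 0), ("s1", "fractions", 0), ("s1", "fractions", 1),
    ("s1", "fractions", 1), ("s2", "fractions", 1), ("s2", "fractions", 1),
    ("s2", "algebra", 0), ("s2", "algebra", 1),
]
rows, labels = build_features(toy_log)
weights = fit_logistic(rows, labels)

# Probability of a correct answer after 2 prior successes and 1 prior failure.
p_next = predict(weights, [1.0, math.log1p(2), math.log1p(1)])
```

Because each prediction is a sigmoid of a weighted sum, every learned weight can be read directly as the effect of its feature on the log-odds of a correct answer, which is one reason such models are comparatively easy to interpret.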