SE · AI · CL · CY · Oct 9, 2025

McMining: Automated Discovery of Misconceptions in Student Code

arXiv:2510.08827v1 · 1 citation · h-index: 9
Originality: Incremental advance
AI Analysis

The paper addresses the problem of identifying student misconceptions in programming education, which can hinder learning and degrade code quality. The advance is incremental: it applies existing LLM capabilities to a new application.

The paper tackles the problem of automatically discovering programming misconceptions in student code, introducing the McMining task and showing that LLM-based approaches using models from the Gemini, Claude, and GPT families are effective at this task.

When learning to code, students often develop misconceptions about various programming language concepts. These can not only lead to bugs or inefficient code, but also slow down the learning of related concepts. In this paper, we introduce McMining, the task of mining programming misconceptions from samples of code from a student. To enable the training and evaluation of McMining systems, we develop an extensible benchmark dataset of misconceptions together with a large set of code samples where these misconceptions are manifested. We then introduce two LLM-based McMiner approaches and through extensive evaluations show that models from the Gemini, Claude, and GPT families are effective at discovering misconceptions in student code.
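To make the task concrete, the sketch below shows one way the inputs and outputs of a McMining system might be represented. It is an illustration under stated assumptions, not the paper's benchmark format or McMiner implementation: the names `Misconception`, `StudentSubmission`, `build_prompt`, and `accuracy`, the label `MC_RANGE_INCLUSIVE`, the prompt wording, and the exact-match scoring are all hypothetical. No model is called; the code only assembles a prompt and scores predictions against gold labels.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Misconception:
    """One benchmark entry (hypothetical schema, not the paper's)."""
    mc_id: str
    description: str


@dataclass
class StudentSubmission:
    """Code samples from one student plus an optional gold label.

    gold_mc_id is None when the samples exhibit no misconception.
    """
    samples: list[str]
    gold_mc_id: str | None = None


def build_prompt(submission: StudentSubmission) -> str:
    """Assemble a plain-text prompt for an LLM (wording is an assumption)."""
    parts = [
        "The following code samples were written by one student.",
        "Describe the single misconception they share, or answer NONE.",
        "",
    ]
    for i, code in enumerate(submission.samples, start=1):
        parts.append(f"--- Sample {i} ---")
        parts.append(code.strip())
    return "\n".join(parts)


def accuracy(predicted: list[str | None], gold: list[str | None]) -> float:
    """Fraction of submissions whose predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Illustrative data only: a student who believes range(n) includes n.
range_mc = Misconception(
    mc_id="MC_RANGE_INCLUSIVE",
    description="Believes range(n) yields values from 0 through n inclusive.",
)
example = StudentSubmission(
    samples=[
        "for i in range(4):\n    print(i)  # expects 0..4",
        "total = sum(range(10))  # expects 55",
    ],
    gold_mc_id=range_mc.mc_id,
)
prompt = build_prompt(example)
```

Exact-match scoring is a simplification chosen to keep the sketch short. A real evaluation would need some way to judge whether a free-text description matches a benchmark misconception, and the abstract does not describe how the paper does this.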

