HC | Mar 30

Simulating Novice Students Using Machine Unlearning and Relearning in Large Language Models

arXiv:2603.26142 | 79.71 citations | h-index: 4

AI Analysis

This work addresses the challenge of keeping novice simulations credible for studying learning-by-teaching pedagogy, particularly in educational domains such as programming. The contribution is incremental, since it builds on existing unlearning methods.

The paper tackles the problem of AI-simulated novice students drifting towards expert-level behaviour in learning-by-teaching systems. It proposes a machine unlearning approach to create stable novice agents, and reports that unlearning produces more novice-like responses and that the agents recover a measurable portion of the unlearned knowledge under structured exposure.

Student simulation can support learning-by-teaching pedagogy where human students (as tutors) teach AI-simulated novice students (as tutees). Recent research often relies on prompt engineering with large language models (LLMs) to simulate novice student behaviour, but it is difficult to keep the AI-simulated student at a stable novice knowledge level. A key reason is that many LLMs are trained to be broadly capable, so even when prompted to "act like a novice," the LLMs can still produce expert-level explanations during the learning-by-teaching interaction process. As a result, the AI-simulated student may drift beyond the intended knowledge level, reducing the credibility of the simulation for studying learning-by-teaching processes. Thus, we propose a knowledge-level simulation approach based on machine unlearning. We investigate this approach using a dataset of multiple-choice questions on Python programming concepts. We apply machine unlearning to transform a knowledgeable LLM into a novice-level AI student (i.e., teachable agent), then evaluate whether the teachable agent can relearn targeted knowledge components through learning-by-teaching dialogue interactions. Finally, we analyse the dialogue logs to characterise how the agent's behaviour changes over time, including its question asking, error patterns, and responsiveness to instruction. The results show that (1) unlearning produces simulated student agents with more novice-like responses than prompt-only baselines, (2) the agents recover a measurable portion of the unlearned knowledge under structured exposure, and (3) dialogue analyses reveal identifiable trajectories of conceptual change and teaching moves that predict learning recovery.
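As a rough illustration of what "recover a measurable portion of the unlearned knowledge" could mean on a multiple-choice dataset, the sketch below scores an agent at three stages (original model, after unlearning, after learning-by-teaching dialogue) and reports how much of the lost accuracy was regained. The abstract does not state the metric the authors use, so the `McqResult` record, the `accuracy` helper, and the `recovery_ratio` formula are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class McqResult:
    """One answered multiple-choice question (hypothetical record format)."""
    concept: str  # knowledge component label, e.g. "loops"
    chosen: str   # option the agent picked
    correct: str  # answer key


def accuracy(results: List[McqResult]) -> float:
    """Fraction of questions answered correctly; 0.0 for an empty list."""
    if not results:
        return 0.0
    return sum(r.chosen == r.correct for r in results) / len(results)


def recovery_ratio(acc_original: float,
                   acc_unlearned: float,
                   acc_relearned: float) -> Optional[float]:
    """Share of the accuracy lost to unlearning that is regained after teaching.

    Illustrative metric only, not taken from the paper:
        (relearned - unlearned) / (original - unlearned)
    Returns None when unlearning removed nothing, because the ratio
    is undefined in that case.
    """
    lost = acc_original - acc_unlearned
    if lost <= 0:
        return None
    return (acc_relearned - acc_unlearned) / lost


# Toy, made-up answers for a single concept (not data from the paper).
after_unlearning = [
    McqResult("loops", "A", "B"),
    McqResult("loops", "C", "C"),
    McqResult("loops", "D", "A"),
    McqResult("loops", "B", "D"),
]
novice_accuracy = accuracy(after_unlearning)
```

A ratio near 0 would indicate that the teaching dialogue restored little of what was unlearned, while a ratio near 1 would indicate near-complete recovery. Computing it per `concept` rather than over the whole question set would show which knowledge components are relearned most readily.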
