CL AI May 25

Toward a Benchmark for Controllable Simulation of Imperfect Students with Large Language Models

arXiv:2605.25601
AI Analysis

For teacher education, this work addresses the need for controllable student simulators to practice instructional responses, though the approach is incremental and model-dependent.

The paper investigates whether large language models can be prompted to simulate students with specific skill profiles (partial mastery) for teacher education. Results show selective partial mastery can be induced and measured in mathematics, but controllability is model-dependent.

Teacher education requires deliberate practice with learners who exhibit identifiable strengths, weaknesses, and partial mastery. Large language models could support such practice by simulating students with known skill components, enabling teachers to rehearse explanations, diagnoses, and instructional responses. For this purpose, however, the central requirement is neither to maximize benchmark accuracy nor to suppress isolated facts, but to control model behavior so that it reflects a specified skill profile. This paper investigates whether prompted language models can be steered to retain some skills while suppressing others. We introduce a benchmark-oriented framework in which an explicit skill vector represents a simulated student, prompt-based control specifies retained and missing competencies, and behavior is evaluated using profile-alignment metrics, retained-versus-forgotten comparisons, and cross-skill calibration analyses. The results show that selective partial mastery can be induced and measured in a structured mathematics setting, although the degree of controllability remains model-dependent. These findings position controllable learner simulation as a distinct research problem at the intersection of teacher education, educational simulation, and language-model control.
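The framework described in the abstract (an explicit skill vector, prompt-based control, and profile-alignment evaluation) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the skill names, the prompt wording, the function names, and the two metric definitions (one minus the mean absolute gap between target mastery and observed accuracy, and a retained-minus-forgotten accuracy gap) are hypothetical and are not taken from the paper. The graded answers are toy data, not reported results.

```python
from statistics import mean

# Hypothetical skill vector: 1 = skill retained, 0 = skill missing.
# Skill names are illustrative only, not the paper's taxonomy.
profile = {"fraction_addition": 1, "long_division": 0, "linear_equations": 1}


def build_prompt(profile):
    """Turn a skill vector into a plain-text student persona prompt."""
    retained = sorted(s for s, v in profile.items() if v)
    missing = sorted(s for s, v in profile.items() if not v)
    return (
        "You are a student. You can reliably do: " + ", ".join(retained) + ". "
        "You have not mastered and make errors on: " + ", ".join(missing) + "."
    )


def per_skill_accuracy(results):
    """results: list of (skill, is_correct) pairs from graded model answers."""
    by_skill = {}
    for skill, ok in results:
        by_skill.setdefault(skill, []).append(1.0 if ok else 0.0)
    return {s: mean(v) for s, v in by_skill.items()}


def profile_alignment(profile, accuracy):
    """Assumed metric: 1 minus mean |target mastery - observed accuracy|."""
    return 1.0 - mean(abs(profile[s] - accuracy[s]) for s in profile)


def retained_forgotten_gap(profile, accuracy):
    """Assumed metric: mean accuracy on retained minus missing skills."""
    kept = [accuracy[s] for s, v in profile.items() if v]
    lost = [accuracy[s] for s, v in profile.items() if not v]
    return mean(kept) - mean(lost)


# Toy graded answers (made up for illustration, not experimental data).
toy_results = [
    ("fraction_addition", True), ("fraction_addition", True),
    ("long_division", False), ("long_division", True),
    ("linear_equations", True), ("linear_equations", False),
]
accuracy = per_skill_accuracy(toy_results)
alignment = profile_alignment(profile, accuracy)
gap = retained_forgotten_gap(profile, accuracy)
```

In this sketch a perfectly controlled simulated student would score an alignment of 1.0 and a gap of 1.0; a model that ignores the prompt and answers everything correctly would show a gap of 0. Real evaluations would likely use graded (non-binary) mastery levels and many more items per skill.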

