CL, May 27

Error as a Lens: Probing LLM Reasoning through Synthetic Misconception Generation

arXiv:2605.29007
AI Analysis

For education researchers and AI developers needing synthetic error datasets for personalized tutoring and teacher training, this framework provides a method to generate targeted misconceptions without real student data.

The paper presents a framework for generating synthetic student errors targeted to a five-class taxonomy adapted from the revised Bloom's taxonomy, evaluated on TheoremQA questions. A Generation Agent drafts candidate erroneous solutions and an Examination Agent checks that each draft is both incorrect and consistent with the target class. The results indicate that targeted error generation is substantially harder than free-form incorrect-answer generation.

Personalized tutoring, teacher training, and education research need access to targeted synthetic misconceptions, but privacy and IRB constraints make labelled corpora of real student errors scarce. LLMs could in principle generate synthetic errors at scale, but producing an arbitrary wrong answer is easy for a modern LLM while producing one that matches a specified cognitive failure mode is much harder. We present a framework that generates errors targeted to a five-class taxonomy adapted from the revised Bloom's taxonomy, evaluated on questions from the TheoremQA dataset. A Generation Agent (GA) drafts a candidate erroneous solution conditioned on a target class, and an Examination Agent (EA) judges whether the draft is incorrect and class-consistent. The framework yields a reusable recipe for building class-stratified synthetic error datasets where authentic student corpora are unavailable. As a secondary diagnostic, targeted error generation is substantially harder than free-form incorrect-answer generation, and answer-grounding contributes more than expanded examples or external textbook content.
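
The generate-then-examine interaction described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `Verdict`, `generate_targeted_error`, `stub_ga`, and `stub_ea` are hypothetical, the retry loop with `max_attempts` is an assumption (the abstract does not say whether rejected drafts are regenerated), and the class label is left as an opaque string because the abstract does not name the five classes. In practice each agent would wrap an LLM call; here they are plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Hypothetical container for the Examination Agent's two judgments."""
    is_incorrect: bool          # the draft does not reach the correct answer
    is_class_consistent: bool   # the draft's error matches the target class


# Assumed agent signatures (not taken from the paper).
GenerationAgent = Callable[[str, str], str]             # (question, target_class) -> draft
ExaminationAgent = Callable[[str, str, str], Verdict]   # (question, draft, target_class) -> verdict


def generate_targeted_error(
    question: str,
    target_class: str,
    ga: GenerationAgent,
    ea: ExaminationAgent,
    max_attempts: int = 3,  # assumption: retry a few times before giving up
) -> Optional[str]:
    """Return the first draft the EA accepts, or None if every attempt is rejected."""
    for _ in range(max_attempts):
        draft = ga(question, target_class)
        verdict = ea(question, draft, target_class)
        # A draft is kept only if it is both wrong and wrong in the requested way.
        if verdict.is_incorrect and verdict.is_class_consistent:
            return draft
    return None


def stub_ga(question: str, target_class: str) -> str:
    """Placeholder generator: tags the draft with the class it was asked for."""
    return f"[{target_class}] wrong solution to: {question}"


def stub_ea(question: str, draft: str, target_class: str) -> Verdict:
    """Placeholder examiner: accepts drafts tagged with the target class."""
    return Verdict(
        is_incorrect=True,
        is_class_consistent=draft.startswith(f"[{target_class}]"),
    )
```

Calling `generate_targeted_error("Q", "class_a", stub_ga, stub_ea)` with the stubs returns the tagged draft on the first attempt. Requiring both judgments before accepting a draft mirrors the abstract's distinction between an arbitrary wrong answer and one that matches a specified failure mode.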
