36.1AIApr 14
Edu-MMBias: A Three-Tier Multimodal Benchmark for Auditing Social Bias in Vision-Language Models under Educational ContextsRuijia Li, Mingzi Zhang, Zengyi Yu et al.
As Vision-Language Models (VLMs) become integral to educational decision-making, ensuring their fairness is paramount. However, current text-centric evaluations neglect the visual modality, leaving an unregulated channel for latent social biases. To bridge this gap, we present Edu-MMBias, a systematic auditing framework grounded in the tri-component model of attitudes from social psychology. This framework diagnoses bias across three hierarchical dimensions: cognitive, affective, and behavioral. Utilizing a specialized generative pipeline that incorporates a self-correct mechanism and human-in-the-loop verification, we synthesize contamination-resistant student profiles to conduct a holistic stress test on state-of-the-art VLMs. Our extensive audit reveals critical, counter-intuitive patterns: models exhibit a compensatory class bias favoring lower-status narratives while simultaneously harboring deep-seated health and racial stereotypes. Crucially, we find that visual inputs act as a safety backdoor, triggering a resurgence of biases that bypass text-based alignment safeguards and revealing a systematic misalignment between latent cognition and final decision-making. The contributions of this paper are available at: https://anonymous.4open.science/r/EduMMBias-63B2.
CLSep 29, 2025Code
DiaCDM: Cognitive Diagnosis in Teacher-Student Dialogues using the Initiation-Response-Evaluation FrameworkRui Jia, Yuang Wei, Ruijia Li et al.
While cognitive diagnosis (CD) effectively assesses students' knowledge mastery from structured test data, applying it to real-world teacher-student dialogues presents two fundamental challenges. Traditional CD models lack a suitable framework for handling dynamic, unstructured dialogues, and it's difficult to accurately extract diagnostic semantics from lengthy dialogues. To overcome these hurdles, we propose DiaCDM, an innovative model. We've adapted the initiation-response-evaluation (IRE) framework from educational theory to design a diagnostic framework tailored for dialogue. We also developed a unique graph-based encoding method that integrates teacher questions with relevant knowledge components to capture key information more precisely. To our knowledge, this is the first exploration of cognitive diagnosis in a dialogue setting. Experiments on three real-world dialogue datasets confirm that DiaCDM not only significantly improves diagnostic accuracy but also enhances the results' interpretability, providing teachers with a powerful tool for assessing students' cognitive states. The code is available at https://github.com/Mind-Lab-ECNU/DiaCDM/tree/main.
51.9AIMar 30
SLOW: Strategic Logical-inference Open Workspace for Cognitive Adaptation in AI TutoringYuang Wei, Ruijia Li, Bo Jiang
While Large Language Models (LLMs) have demonstrated remarkable fluency in educational dialogues, most generative tutors primarily operate through intuitive, single-pass generation. This reliance on fast thinking precludes a dedicated reasoning workspace, forcing multiple diagnostic and strategic signals to be processed in a conflated manner. As a result, learner cognitive diagnosis, affective perception, and pedagogical decision-making become tightly entangled, which limits the tutoring system's capacity for deliberate instructional adaptation. We propose SLOW, a theory-informed tutoring framework that supports deliberate learner-state reasoning within a transparent decision workspace. Inspired by dual-process accounts of human tutoring, SLOW explicitly separates learner-state inference from instructional action selection. The framework integrates causal evidence parsing from learner language, fuzzy cognitive diagnosis with counterfactual stability analysis, and prospective affective reasoning to anticipate how instructional choices may influence learners' emotional trajectories. These signals are jointly considered to guide pedagogically and affectively aligned tutoring strategies. Evaluation using hybrid human-AI judgments demonstrates significant improvements in personalization, emotional sensitivity, and clarity. Ablation studies further confirm the necessity of each module, showcasing how SLOW enables interpretable and reliable intelligent tutoring through a visualized decision-making process. This work advances the interpretability and educational validity of LLM-based adaptive instruction.
HCMar 6
Beyond Scores: Explainable Intelligent Assessment Strengthens Pre-service Teachers' Assessment LiteracyYuang Wei, Fei Wang, Yifan Zhang et al.
Assessment literacy (AL) is essential for personalized education, yet difficult to cultivate in pre-service teachers. Conventional teacher preparation programs focus on theoretical knowledge, while digital assessment tools commonly provide opaque scores or parameters. These limitations hinder reflection and transfer, leaving AL underdeveloped. We propose XIA, an eXplainable Intelligent Assessment platform that extends statistics-informed support with visualized cognitive diagnostic reasoning, including contrastive and counterfactual explanations. In a pre-post controlled study with 21 pre-service teachers, we combined quantitative tasks and questionnaires with qualitative interviews. The findings offer preliminary evidence that XIA supported reflection, self-regulation, and assessment awareness, and helped reduce assessment errors. Interviews further showed a shift from score-based judgments toward evidence-based reasoning. This work contributes insights into the design of intelligent assessment tools, showing how explanatory scaffolding can bridge assessment theory and classroom practice and support the cultivation of AL in teacher education.
MADec 30, 2024
AI Agent for Education: von Neumann Multi-Agent System FrameworkYuan-Hao Jiang, Ruijia Li, Yizhou Zhou et al.
The development of large language models has ushered in new paradigms for education. This paper centers on the multi-Agent system in education and proposes the von Neumann multi-Agent system framework. It breaks down each AI Agent into four modules: control unit, logic unit, storage unit, and input-output devices, defining four types of operations: task deconstruction, self-reflection, memory processing, and tool invocation. Furthermore, it introduces related technologies such as Chain-of-Thought, Reson+Act, and Multi-Agent Debate associated with these four types of operations. The paper also discusses the ability enhancement cycle of a multi-Agent system for education, including the outer circulation for human learners to promote knowledge construction and the inner circulation for LLM-based-Agents to enhance swarm intelligence. Through collaboration and reflection, the multi-Agent system can better facilitate human learners' learning and enhance their teaching abilities in this process.
CYDec 5, 2024
Learning-by-teaching with ChatGPT: The effect of teachable ChatGPT agent on programming educationAngxuan Chen, Yuang Wei, Huixiao Le et al.
This study investigates the potential of using ChatGPT as a teachable agent to support students' learning by teaching process, specifically in programming education. While learning by teaching is an effective pedagogical strategy for promoting active learning, traditional teachable agents have limitations, particularly in facilitating natural language dialogue. Our research explored whether ChatGPT, with its ability to engage learners in natural conversations, can support this process. The findings reveal that interacting with ChatGPT improves students' knowledge gains and programming abilities, particularly in writing readable and logically sound code. However, it had limited impact on developing learners' error-correction skills, likely because ChatGPT tends to generate correct code, reducing opportunities for students to practice debugging. Additionally, students' self-regulated learning (SRL) abilities improved, suggesting that teaching ChatGPT fosters learners' higher self-efficacy and better implementation of SRL strategies. This study discussed the role of natural dialogue in fostering socialized learning by teaching, and explored ChatGPT's specific contributions in supporting students' SRL through the learning by teaching process. Overall, the study highlights ChatGPT's potential as a teachable agent, offering insights for future research on ChatGPT-supported education.
LGMay 20, 2025
MAS-KCL: Knowledge component graph structure learning with large language model-based agentic workflowYuan-Hao Jiang, Kezong Tang, Zi-Wei Chen et al.
Knowledge components (KCs) are the fundamental units of knowledge in the field of education. A KC graph illustrates the relationships and dependencies between KCs. An accurate KC graph can assist educators in identifying the root causes of learners' poor performance on specific KCs, thereby enabling targeted instructional interventions. To achieve this, we have developed a KC graph structure learning algorithm, named MAS-KCL, which employs a multi-agent system driven by large language models for adaptive modification and optimization of the KC graph. Additionally, a bidirectional feedback mechanism is integrated into the algorithm, where AI agents leverage this mechanism to assess the value of edges within the KC graph and adjust the distribution of generation probabilities for different edges, thereby accelerating the efficiency of structure learning. We applied the proposed algorithm to 5 synthetic datasets and 4 real-world educational datasets, and experimental results validate its effectiveness in learning path recognition. By accurately identifying learners' learning paths, teachers are able to design more comprehensive learning plans, enabling learners to achieve their educational goals more effectively, thus promoting the sustainable development of education.
AIDec 1, 2024
The Advancement of Personalized Learning Potentially Accelerated by Generative AIYuang Wei, Yuan-Hao Jiang, Jiayi Liu et al.
The rapid development of Generative AI (GAI) has sparked revolutionary changes across various aspects of education. Personalized learning, a focal point and challenge in educational research, has also been influenced by the development of GAI. To explore GAI's extensive impact on personalized learning, this study investigates its potential to enhance various facets of personalized learning through a thorough analysis of existing research. The research comprehensively examines GAI's influence on personalized learning by analyzing its application across different methodologies and contexts, including learning strategies, paths, materials, environments, and specific analyses within the teaching and learning processes. Through this in-depth investigation, we find that GAI demonstrates exceptional capabilities in providing adaptive learning experiences tailored to individual preferences and needs. Utilizing different forms of GAI across various subjects yields superior learning outcomes. The article concludes by summarizing scenarios where GAI is applicable in educational processes and discussing strategies for leveraging GAI to enhance personalized learning, aiming to guide educators and learners in effectively utilizing GAI to achieve superior learning objectives.
AIJun 25, 2024
Enhancing Explainability of Knowledge Learning Paths: Causal Knowledge NetworksYuang Wei, Yizhou Zhou, Yuan-Hao Jiang et al.
A reliable knowledge structure is a prerequisite for building effective adaptive learning systems and intelligent tutoring systems. Pursuing an explainable and trustworthy knowledge structure, we propose a method for constructing causal knowledge networks. This approach leverages Bayesian networks as a foundation and incorporates causal relationship analysis to derive a causal network. Additionally, we introduce a dependable knowledge-learning path recommendation technique built upon this framework, improving teaching and learning quality while maintaining transparency in the decision-making process.
CLAug 20, 2019
GeoSQA: A Benchmark for Scenario-based Question Answering in the Geography Domain at High School LevelZixian Huang, Yulin Shen, Xiao Li et al.
Scenario-based question answering (SQA) has attracted increasing research attention. It typically requires retrieving and integrating knowledge from multiple sources, and applying general knowledge to a specific case described by a scenario. SQA widely exists in the medical, geography, and legal domains---both in practice and in the exams. In this paper, we introduce the GeoSQA dataset. It consists of 1,981 scenarios and 4,110 multiple-choice questions in the geography domain at high school level, where diagrams (e.g., maps, charts) have been manually annotated with natural language descriptions to benefit NLP research. Benchmark results on a variety of state-of-the-art methods for question answering, textual entailment, and reading comprehension demonstrate the unique challenges presented by SQA for future research.