HC · AI · LG · Nov 26, 2024

AI2T: Building Trustable AI Tutors by Interactively Teaching a Self-Aware Learning Agent

arXiv:2411.17924v1 · 6 citations · h-index: 16
Originality: Highly original
AI Analysis

The paper addresses the challenge of data-efficient, trustable authoring for complex ITSs. The contribution is incremental: it builds on prior authoring-by-tutoring methods while improving reliability and self-awareness.

The paper tackles the problem of building intelligent tutoring systems (ITSs) by introducing AI2T, an interactively teachable AI that learns robust rules for step-by-step solution tracking from just 20-30 minutes of training. By comparison, complex ITSs can require as many as 200-300 hours of programming per hour of instruction. AI2T uses a self-aware precondition learning algorithm, STAND, which the authors report outperforms state-of-the-art methods like XGBoost, and it produces more reliable programs than hallucination-prone LLMs and prior authoring-by-tutoring approaches.

AI2T is an interactively teachable AI for authoring intelligent tutoring systems (ITSs). Authors tutor AI2T by providing a few step-by-step solutions and then grading AI2T's own problem-solving attempts. From just 20-30 minutes of interactive training, AI2T can induce robust rules for step-by-step solution tracking (i.e., model-tracing). As AI2T learns, it can accurately estimate its certainty of performing correctly on unseen problem steps using STAND, a self-aware precondition learning algorithm that outperforms state-of-the-art methods like XGBoost. Our user study shows that authors can use STAND's certainty heuristic to estimate when AI2T has been trained on enough diverse problems to induce correct and complete model-tracing programs. AI2T-induced programs are more reliable than hallucination-prone LLMs and prior authoring-by-tutoring approaches. With its self-aware induction of hierarchical rules, AI2T offers a path toward trustable, data-efficient authoring-by-tutoring for complex ITSs that normally require as many as 200-300 hours of programming per hour of instruction.
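For readers unfamiliar with model-tracing, the following is a minimal sketch of the general idea: a set of rules, each with a precondition and an action, proposes the acceptable next steps for a problem state, and a student's step is marked correct only if it matches one of them. Everything here is an illustrative assumption: the two-digit addition domain, the `Rule` class, and the hand-written rules are hypothetical and are not AI2T's induced rules or the STAND algorithm.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical problem state: the two addends plus the interface fields
# the student fills in (None means "not filled in yet").
State = Dict[str, Optional[int]]
Step = Tuple[str, int]  # (interface field, value entered)


@dataclass
class Rule:
    name: str
    precondition: Callable[[State], bool]  # when the rule may fire
    action: Callable[[State], Step]        # the step it proposes


def ones_sum(s: State) -> int:
    return s["a"] % 10 + s["b"] % 10


# Hand-written rules for adding two 2-digit numbers (illustration only).
RULES: List[Rule] = [
    Rule("add_ones",
         lambda s: s["ones"] is None,
         lambda s: ("ones", ones_sum(s) % 10)),
    Rule("carry",
         lambda s: s["carry"] is None and ones_sum(s) >= 10,
         lambda s: ("carry", 1)),
    Rule("add_tens",
         lambda s: s["tens"] is None and s["ones"] is not None
         and (s["carry"] is not None or ones_sum(s) < 10),
         lambda s: ("tens", s["a"] // 10 + s["b"] // 10 + (s["carry"] or 0))),
]


def next_steps(state: State, rules: List[Rule]) -> List[Step]:
    """Return every step that some applicable rule proposes."""
    return [r.action(state) for r in rules if r.precondition(state)]


def trace(state: State, student_step: Step, rules: List[Rule]) -> bool:
    """Model-trace one student step: correct iff a rule proposes it."""
    return student_step in next_steps(state, rules)


def apply_step(state: State, step: Step) -> State:
    """Return a new state with the step's field filled in."""
    field, value = step
    return {**state, field: value}


state: State = {"a": 47, "b": 38, "ones": None, "carry": None, "tens": None}
print(trace(state, ("ones", 5), RULES))   # the ones digit of 7 + 8
print(trace(state, ("tens", 8), RULES))   # too early: ones not filled in yet
```

In an authoring-by-tutoring setting such as the one the abstract describes, the preconditions would be learned from the author's demonstrations and grading feedback instead of being written by hand as they are in this sketch.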
