Ploy Thajchayapong

CY
h-index: 26
Papers: 7
Citations: 5
Novelty: 29%
AI Score: 45

7 Papers

AI · May 28
Surfacing Isolated Learners with Outcome-Independent Mediation of Feedback between Teachers and Students Using AI

Junsoo Park, Youssef Medhat, Htet Phyo Wai et al.

AI-augmented classrooms generate rich teacher and student feedback before graded outcomes become available, yet these signals can be difficult to translate into timely instructional decisions. We propose an interpretable decision layer: a transparent mechanism that ranks course topics requiring attention without using grades or post-hoc outcome labels. The approach combines three signals: student learning difficulty prevalence, disagreement between learner self-reports and observed difficulties, and unresolved teacher concerns. The output is a ranked set of topic priorities with per-topic decision records explaining each ranking. In one graduate CS course offering ($n=5$ instructor interviews; $n=279$ survey responses), prioritized topics aligned with instructor concerns (top-5 overlap 3/5; Spearman $ρ=0.80$) and student-reported topic difficulty ($ρ=0.46$, $p=.048$). Multi-signal integration also surfaced learners not identified through individual signal sources alone (AUC $=0.96$ vs. $0.91$ for gap prevalence alone). Reflective thinking, help-seeking, and self-efficacy provided additional evidence that student behavioral signals align with learning-related constructs. While preliminary, these findings suggest that transparent coordination mechanisms may help support human-AI co-agency when feedback is incomplete.
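As an illustration only, the three-signal ranking and the top-5 overlap check described in this abstract could be sketched as follows. The abstract does not say how the signals are combined, so the equal weights, the [0, 1] scaling of each signal, and the names `rank_topics` and `top_k_overlap` are assumptions made for this sketch, not the authors' method.

```python
def rank_topics(signals, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Rank topics from three signals, each assumed to be scaled to [0, 1].

    signals maps topic -> (difficulty_prevalence, self_report_disagreement,
    unresolved_teacher_concern). Equal weights are an assumption of this sketch.
    Returns per-topic decision records sorted by descending score.
    """
    records = []
    for topic, values in signals.items():
        # Keep each signal's contribution so the ranking can be explained.
        contributions = [w * v for w, v in zip(weights, values)]
        records.append(
            {"topic": topic, "score": sum(contributions), "contributions": contributions}
        )
    records.sort(key=lambda r: r["score"], reverse=True)
    return records


def top_k_overlap(ranked_a, ranked_b, k):
    """Count the items shared by the top-k entries of two rankings."""
    return len(set(ranked_a[:k]) & set(ranked_b[:k]))


# Hypothetical topics and signal values, for illustration only.
example_signals = {
    "recursion": (0.8, 0.6, 1.0),
    "graphs": (0.2, 0.1, 0.0),
    "sorting": (0.5, 0.5, 0.5),
}
example_ranking = [r["topic"] for r in rank_topics(example_signals)]
```

Keeping the per-signal contributions in each record is one simple way to produce the kind of per-topic explanation the abstract mentions; a real system would likely record more context.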

ET · Jun 1
Powering An Ecosystem Of Pedagogical AI Agents: A Validation Strategy For A Unified Data Architecture

Natalia Theodora, Ploy Thajchayapong, Ashok K. Goel

The application of AI in education has evolved from monolithic intelligent tutoring systems to a diverse ecosystem of pedagogical agents, including conversational assistants, virtual coaches, and adaptive tutors. This shift requires a unified and scalable data architecture to manage the complex information feedback loops between human instructors, learners, and the varied AI agents. The design, development, and deployment of the data architecture in turn raise a critical issue of validation. This paper addresses this need by describing a practical validation strategy for a high-volume data pipeline developed as part of a data architecture for AI-augmented adult learning at the National AI Institute for Adult Learning and Online Education. Our approach involves a two-stage testing methodology to ensure both functional diversity and real-world scalability. First, the QA environment uses a blend of synthetic and real-world data to validate functional correctness across various event types produced from learner and agent interactions. Following this, the production environment processed over 2.7 million production requests across 21 successful runs carrying authentic event data from a large-scale online program. This validation process surfaced crucial insights into data privacy, a key challenge when handling varied data from multiple AI agent data sources. By outlining a replicable testing strategy for a unified data backbone, this research offers a clear framework for institutions and developers aiming to build and support their own heterogeneous suites of AI-powered learning tools.
Keywords: Pedagogical Agents, Learning Ecosystems, Data Architecture, Validation, Scalability, Learning Analytics.

CY · May 28
Generalizing a Highly Configurable Analytics Pipeline to Replicate and Support Educational Research Across Multiple Domains

Yallen Bai, Ploy Thajchayapong, Ashok Goel

Artificial intelligence assistants deployed in online learning environments create new opportunities to collect large volumes of learner interaction data and generate insights to improve student outcomes. Architecture for AI-Augmented Learning (A4L) is a modular data architecture that enables the collection, integration, and analysis of learner interaction data from educational AI systems, supporting the generation of instructional insights that facilitate personalized learning and reinforce the bidirectional feedback loop between instructors and learners. This study examines the modular design of the A4L Data Analytics Pipeline, an extensible data infrastructure that enables the ingestion, processing, and analysis of heterogeneous datasets generated by educational AI assistants. We describe the design principles and development process used to extend the pipeline's analytical capabilities while preserving flexibility across domains. We evaluate the pipeline through case studies spanning three research domains corresponding to three educational AI assistants deployed in online learning environments at Georgia Tech. Results show that a common set of statistical analysis methods can be consistently applied across datasets with differing structures and instructional contexts, enabling the pipeline to reproduce key analytical findings across domains. We demonstrate how analytical capabilities initially developed for one domain can be extended to support richer analyses in another, illustrating the pipeline's extensibility. These findings suggest that the A4L Analytics Pipeline can serve as reusable infrastructure for analyzing data generated by future educational AI assistants. By enabling analytics that can be systematically extended to new domains, the pipeline provides a foundation for deriving insights that inform the design and evaluation of educational AI systems.

CY · May 6
Guidelines for Designing AI Technologies to Support Adult Learning

Jennifer M. Reddig, Glen R. Smith, Sanaz Ahmadzadeh Siyahrood et al.

AI-powered educational technologies have demonstrated measurable benefits for learners, but their design and evaluation have largely centered on K-12 contexts. As a result, many AI-supported learning systems remain poorly aligned with the needs, constraints, and goals of adult learners. To better understand how AI systems function in adult education, this paper examines the deployment of several AI learning technologies developed within a multidisciplinary, national research institute in the United States focused on adult learning and online education. Drawing on longitudinal deployment data, we conducted a reflexive thematic analysis to identify recurring challenges and design considerations across systems. These insights were synthesized into a set of 19 design guidelines intended to inform future AI-supported adult learning technologies. We demonstrate the utility of these guidelines through a heuristic evaluation of the deployed systems. Lastly, we present a guideline exploration tool that aids in the ideation of technologies by connecting the guidelines to stakeholder statements surfaced in the analysis process.

IR · Apr 14
Memory-Based vs. Context-Only Conditioning Produces Distinct Behavioral Patterns in Stateful Personalization

Junsoo Park, Youssef Medhat, Htet Phyo Wai et al.

We study how conditioning context shapes personalization behavior in a teacher-facing educational recommender system. We compare contextual conditioning based on the current student question with memory-based conditioning using persistent learner information. Using deviation correlation and paired statistical tests, we find that contextual recommendations exhibit stronger question-level responsiveness, while memory-based recommendations exhibit history-dependent behaviors, including learner-specific differentiation under identical input. Teacher-facing evaluation signals suggest these recommendations are interpretable and actionable. These results indicate that embedding-based similarity metrics capture responsiveness to the current question but do not characterize personalization grounded in learner history, motivating behavior-level diagnostics for studying conditioning effects.

CL · Apr 7
Evaluating Learner Representations for Differentiation Prior to Instructional Outcomes

Junsoo Park, Youssef Medhat, Htet Phyo Wai et al.

Learner representations play a central role in educational AI systems, yet it is often unclear whether they preserve meaningful differences between students when instructional outcomes are unavailable or highly context-dependent. This work examines how to evaluate learner representations based on whether they retain separation between learners under a shared comparison rule. We introduce distinctiveness, a representation-level measure that evaluates how each learner differs from others in the cohort using pairwise distances, without requiring clustering, labels, or task-specific evaluation. Using student-authored questions collected through a conversational AI agent in an online learning environment, we compare representations based on individual questions with representations that aggregate patterns across a student's interactions over time. Results show that learner-level representations yield higher separation, stronger clustering structure, and more reliable pairwise discrimination than interaction-level representations. These findings demonstrate that learner representations can be evaluated independently of instructional outcomes and provide a practical pre-deployment criterion using distinctiveness as a diagnostic metric for assessing whether a representation supports differentiated modeling or personalization.
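To make the idea of a pairwise-distance measure concrete, here is a minimal sketch that scores each learner by the mean Euclidean distance from their representation to every other learner's. The abstract does not give the formula for distinctiveness, so this definition, the choice of Euclidean distance, and the function name `distinctiveness` are assumptions for illustration rather than the paper's measure.

```python
import math


def distinctiveness(vectors):
    """Mean Euclidean distance from each learner vector to all the others.

    This is an assumed, simplified stand-in for a pairwise-distance measure;
    it needs no labels or clustering, only the cohort's representations.
    """
    n = len(vectors)
    if n < 2:
        raise ValueError("need at least two learners to compare")
    scores = []
    for i, v in enumerate(vectors):
        # Sum distances to every other learner, skipping the learner itself.
        total = sum(math.dist(v, w) for j, w in enumerate(vectors) if j != i)
        scores.append(total / (n - 1))
    return scores


# Hypothetical 2-D learner representations, for illustration only.
cohort = [(0.0, 0.0), (3.0, 4.0), (0.0, 0.0)]
cohort_scores = distinctiveness(cohort)
```

Under this sketch, a representation in which many learners collapse onto the same point yields low scores for those learners, which is the kind of loss of separation the abstract is concerned with.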

CY · May 8, 2025
A4L: An Architecture for AI-Augmented Learning

Ashok Goel, Ploy Thajchayapong, Vrinda Nandan et al.

AI promises personalized learning and scalable education. As AI agents increasingly permeate education in support of teaching and learning, there is a critical and urgent need for data architectures for collecting and analyzing data on learning, and feeding the results back to teachers, learners, and the AI agents for personalization of learning at scale. At the National AI Institute for Adult Learning and Online Education, we are developing an Architecture for AI-Augmented Learning (A4L) for supporting adult learning through online education. We present the motivations, goals, and requirements of the A4L architecture. We describe preliminary applications of A4L and discuss how it advances the goals of making learning more personalized and scalable.