HC · CV · May 2

EduGage: Methods and Dataset for Sensor-Based Momentary Assessment of Engagement in Self-Guided Video Learning

Zikang Leng, Edan Eyal, Yingtian Shi, Jiaman He, Yaqi Liu, Thomas Plötz

arXiv:2605.01238 · 32.1 · h-index: 7

Predicted impact top 59% in HC · last 90 days · Originality: Incremental advance

AI Analysis

For researchers and developers of adaptive learning systems, this work provides a benchmark dataset and shows that lightweight sensor combinations can effectively estimate engagement, though the small sample size (16 participants) limits generalizability.

The paper introduces EduGage, a multimodal sensor dataset for estimating learner engagement in self-guided video learning, and demonstrates that a model combining behavioral and physiological signals achieves an MAE of 0.81 and 73.93% binary accuracy, outperforming sensor-free and deep learning baselines.

Engagement, which links to attentional, emotional, and cognitive dimensions, plays an important role in learning. In online and video-based learning environments, learners often need to regulate their own interactions with instructional materials. Measuring and reflecting on engagement can therefore support both learners and adaptive learning systems. In this study, we use wearable and camera-based sensing devices to collect physiological and motion signals, including PPG, ECG, EDA, EEG, IMU, heart rate, temperature, and eye-tracking data, to estimate learner engagement. We conducted a user study with 16 participants in a video-based learning scenario, where participants completed learning tasks and provided repeated in-situ self-reports of engagement through brief probes. We develop and evaluate a system for engagement estimation, compare different sensing modalities, and further analyze the feasibility and effectiveness of multimodal modeling for characterizing learner engagement. Across participant-based cross-validation, our model achieves an MAE of 0.81, 83.75% within-1 accuracy, 73.93% binary accuracy, and 68.45% binary Macro-F1, outperforming sensor-free, statistical, deep temporal, foundation-model, and LLM-based baselines. Our results suggest that fine-grained engagement estimation is feasible but inherently noisy, and that practical systems should prioritize lightweight combinations of behavioral and physiological signals over full multimodal instrumentation. We release the EduGage dataset, including synchronized multimodal sensor signals, probe-aligned momentary engagement labels, video metadata, quizzes, and study materials, to support reproducible research on fine-grained sensor-based engagement modeling in self-guided learning.
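The four reported metrics and the participant-based evaluation can be illustrated with a minimal sketch. Several details here are assumptions, not taken from the paper: the abstract does not give the rating scale, the binarization cutoff, the exact definition of within-1 accuracy, or whether the cross-validation is leave-one-participant-out. The sketch assumes ordinal labels, a hypothetical cutoff of 3 (ratings above it count as "engaged"), within-1 meaning an absolute error of at most 1, and one held-out participant per fold. All function names are illustrative.

```python
from collections import defaultdict

def mae(y_true, y_pred):
    """Mean absolute error between labels and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def within_one_accuracy(y_true, y_pred):
    """Fraction of predictions within 1 point of the label (assumed definition)."""
    return sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def evaluate(y_true, y_pred, threshold=3):
    """Compute all four metrics; `threshold` is a hypothetical binarization cutoff."""
    true_bin = [int(t > threshold) for t in y_true]
    pred_bin = [int(p > threshold) for p in y_pred]
    correct = sum(t == p for t, p in zip(true_bin, pred_bin))
    return {
        "mae": mae(y_true, y_pred),
        "within_1": within_one_accuracy(y_true, y_pred),
        "binary_acc": correct / len(true_bin),
        "binary_macro_f1": macro_f1(true_bin, pred_bin),
    }

def leave_one_participant_out(participant_ids):
    """Yield (held_out_id, train_indices, test_indices), one fold per participant,
    so no participant's probes appear in both the training and test split."""
    by_pid = defaultdict(list)
    for i, pid in enumerate(participant_ids):
        by_pid[pid].append(i)
    for pid, test_idx in by_pid.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(participant_ids)) if i not in held_out]
        yield pid, train_idx, test_idx

# Toy data, invented purely for illustration (not from EduGage).
labels = [1, 2, 4, 5]
preds = [2.0, 2.5, 3.5, 2.0]
metrics = evaluate(labels, preds)
folds = list(leave_one_participant_out(["p1", "p1", "p2", "p3"]))
```

Splitting by participant rather than by probe keeps each person's data entirely in either the training or the test fold, so the scores reflect generalization to unseen learners. That is the scenario the 16-participant caveat concerns.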
