CE · Apr 14

Early-Warning Learner Satisfaction Forecasting in MOOCs via Temporal Event Transformers and LLM Text Embeddings

arXiv:2604.13241 · 4.7 · h-index: 2

Predicted impact: top 92% in CE · last 90 days
Originality: Incremental advance

AI Analysis

For MOOC platforms, this enables early intervention to improve retention and engagement by predicting satisfaction before course completion.

The paper introduces TET-LLM, a multi-modal framework that combines a temporal event Transformer, LLM text embeddings, and short-text topic/aspect distributions to forecast learner satisfaction in MOOCs from early course signals alone (e.g., the first 7 days). It achieves an RMSE of 0.82 and an AUC of 0.77 at the 7-day horizon, outperforming aggregate-feature and text-only baselines.

Learner satisfaction is a critical quality signal in massive open online courses (MOOCs), directly influencing retention, engagement, and platform reputation. Most existing methods infer satisfaction \emph{post hoc} from end-of-course reviews and star ratings, which are too late for effective intervention. In this paper, we study \textbf{early-warning satisfaction forecasting}: predicting a learner's eventual satisfaction score using only signals observed in the first $t$ days of a course (e.g., $t\!\in\!\{7, 14, 28\}$). We propose \textbf{TET-LLM}, a multi-modal fusion framework that combines (i) a \emph{temporal event Transformer} over fine-grained behavioral event sequences, (ii) \emph{LLM-based contextual embeddings} extracted from early textual traces such as forum posts and short feedback, and (iii) short-text \emph{topic/aspect distributions} to capture coarse satisfaction drivers. A heteroscedastic regression head outputs both a point estimate and a predictive uncertainty score, enabling conservative intervention policies. Comprehensive experiments on a large-scale multi-platform MOOC dataset demonstrate that TET-LLM consistently outperforms aggregate-feature and text-only baselines across all early-horizon settings, achieving an RMSE of 0.82 and AUC of 0.77 at the 7-day horizon. Ablation studies confirm the complementary contribution of each modality, and uncertainty calibration analysis shows near-nominal 90\% interval coverage.
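
The abstract does not spell out the heteroscedastic head, so the following is only a minimal sketch of how such a head is commonly trained and evaluated, assuming a Gaussian likelihood with a predicted mean and log-variance per learner. The function names, the satisfaction threshold of 3.0, and the "flag only when the upper bound is below the threshold" rule are illustrative assumptions, not details taken from the paper.

```python
import math

# Two-sided 90% quantile of the standard normal distribution.
Z_90 = 1.6448536269514722


def gaussian_nll(y, mu, log_var):
    """Per-sample Gaussian negative log-likelihood.

    Assumption: the head predicts a mean `mu` and a log-variance `log_var`;
    the paper's actual loss may differ.
    """
    return 0.5 * (math.log(2 * math.pi) + log_var + (y - mu) ** 2 / math.exp(log_var))


def interval_90(mu, log_var):
    """Central 90% predictive interval under the Gaussian assumption."""
    sigma = math.exp(0.5 * log_var)
    return mu - Z_90 * sigma, mu + Z_90 * sigma


def coverage(ys, mus, log_vars):
    """Fraction of true scores that fall inside their 90% intervals."""
    hits = 0
    for y, mu, lv in zip(ys, mus, log_vars):
        lo, hi = interval_90(mu, lv)
        hits += lo <= y <= hi
    return hits / len(ys)


def should_intervene(mu, log_var, threshold=3.0):
    """Hypothetical conservative policy: flag a learner only when even the
    upper end of the 90% interval is below the satisfaction threshold."""
    _, hi = interval_90(mu, log_var)
    return hi < threshold
```

Under this reading, "near-nominal 90% interval coverage" would mean that `coverage` evaluated on held-out learners returns a value close to 0.9.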
