Early-Warning Learner Satisfaction Forecasting in MOOCs via Temporal Event Transformers and LLM Text Embeddings
For MOOC platforms, this enables early intervention to improve retention and engagement by predicting satisfaction before course completion.
The paper introduces TET-LLM, a multi-modal framework combining temporal event transformers and LLM text embeddings to forecast learner satisfaction in MOOCs using only early course signals (e.g., first 7 days). It achieves RMSE of 0.82 and AUC of 0.77 at the 7-day horizon, outperforming baselines.
Learner satisfaction is a critical quality signal in massive open online courses (MOOCs), directly influencing retention, engagement, and platform reputation. Most existing methods infer satisfaction \emph{post hoc} from end-of-course reviews and star ratings, which are too late for effective intervention. In this paper, we study \textbf{early-warning satisfaction forecasting}: predicting a learner's eventual satisfaction score using only signals observed in the first $t$ days of a course (e.g., $t\!\in\!\{7, 14, 28\}$). We propose \textbf{TET-LLM}, a multi-modal fusion framework that combines (i) a \emph{temporal event Transformer} over fine-grained behavioral event sequences, (ii) \emph{LLM-based contextual embeddings} extracted from early textual traces such as forum posts and short feedback, and (iii) short-text \emph{topic/aspect distributions} to capture coarse satisfaction drivers. A heteroscedastic regression head outputs both a point estimate and a predictive uncertainty score, enabling conservative intervention policies. Comprehensive experiments on a large-scale multi-platform MOOC dataset demonstrate that TET-LLM consistently outperforms aggregate-feature and text-only baselines across all early-horizon settings, achieving an RMSE of 0.82 and AUC of 0.77 at the 7-day horizon. Ablation studies confirm the complementary contribution of each modality, and uncertainty calibration analysis shows near-nominal 90\% interval coverage.