CL
Oct 21, 2025

DeBERTa-KC: A Transformer-Based Classifier for Knowledge Construction in Online Learning Discourse

arXiv:2510.19858v1
h-index: 8
Originality: Synthesis-oriented
AI Analysis

The paper addresses the need for scalable, theory-informed tools to assess epistemic engagement in informal digital learning environments, though its contribution is incremental: it applies existing methods to a new domain.

The study addresses automatic classification of knowledge construction levels in online science learning discourse. Its transformer-based model, DeBERTa-KC, achieved a macro-F1 score of 0.836 ± 0.008, significantly outperforming baselines.

This study presents DeBERTa-KC, a transformer-based model for automatic classification of knowledge construction (KC) levels in online science learning discourse. Using comments collected from four popular YouTube science channels (2022-2024), a balanced corpus of 20,000 manually annotated samples was created across four KC categories: nonKC, Share, Explore, and Negotiate. The proposed model extends DeBERTa-v3 with Focal Loss, Label Smoothing, and R-Drop regularization to address class imbalance and enhance generalization. A reproducible end-to-end pipeline was implemented, encompassing data extraction, annotation, preprocessing, training, and evaluation. Across 10-fold stratified cross-validation, DeBERTa-KC achieved a macro-F1 of 0.836 ± 0.008, significantly outperforming both classical and transformer baselines (p < 0.01). Per-category results indicate strong sensitivity to higher-order epistemic engagement, particularly in Explore and Negotiate discourse. These findings demonstrate that large language models can effectively capture nuanced indicators of knowledge construction in informal digital learning environments, offering scalable, theory-informed approaches to discourse analysis and the development of automated tools for assessing epistemic engagement.
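The abstract names three training ingredients (Focal Loss, Label Smoothing, R-Drop) without saying how they are combined. The following is a minimal sketch of one plausible combination for a single example, written in plain Python rather than a deep learning framework. Everything specific here is an assumption and not taken from the paper: the function names, the way smoothing is applied inside the focal term, and the values gamma=2.0, eps=0.1, and alpha=1.0. The two logit vectors stand in for two forward passes of the same input under different dropout masks, which is what R-Drop compares.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smoothed_targets(label, num_classes, eps):
    """Label smoothing: (1 - eps) on the true class plus eps / K on every class."""
    return [
        (1.0 - eps) * (1.0 if k == label else 0.0) + eps / num_classes
        for k in range(num_classes)
    ]


def focal_loss(probs, targets, gamma):
    """Focal loss against soft targets: -sum_k t_k * (1 - p_k)^gamma * log(p_k)."""
    return -sum(
        t * (1.0 - p) ** gamma * math.log(p) for p, t in zip(probs, targets)
    )


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def rdrop_focal_loss(logits_a, logits_b, label, gamma=2.0, eps=0.1, alpha=1.0):
    """Supervised focal terms for both passes plus a symmetric KL consistency term.

    gamma, eps, and alpha are illustrative defaults, not values from the paper.
    """
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    targets = smoothed_targets(label, len(p_a), eps)
    supervised = focal_loss(p_a, targets, gamma) + focal_loss(p_b, targets, gamma)
    consistency = 0.5 * (kl_divergence(p_a, p_b) + kl_divergence(p_b, p_a))
    return supervised + alpha * consistency


# Hypothetical logits over the four KC classes: nonKC, Share, Explore, Negotiate.
pass_a = [0.2, 1.5, 0.3, -0.4]
pass_b = [0.1, 1.1, 0.6, -0.2]
loss = rdrop_focal_loss(pass_a, pass_b, label=1)
```

With gamma=0 and eps=0 the focal term reduces to ordinary cross-entropy, and when both passes produce identical logits the consistency term vanishes, so alpha only matters when the two dropout passes disagree.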

