CV · HC · Apr 20, 2021

Improving state-of-the-art in Detecting Student Engagement with Resnet and TCN Hybrid Network

arXiv:2104.10122v2 · 73 citations
AI Analysis

This work addresses the need for automated engagement detection to personalize learning in online education, representing an incremental improvement over prior methods.

The paper tackles the detection of student engagement levels from videos in online learning by proposing a ResNet and TCN hybrid network, which outperforms the existing methods it was compared against and sets a new state-of-the-art accuracy on the DAiSEE dataset.

Automatic detection of students' engagement in online learning settings is a key element to improve the quality of learning and to deliver personalized learning materials to them. Varying levels of engagement exhibited by students in an online classroom is an affective behavior that takes place over space and time. Therefore, we formulate detecting levels of students' engagement from videos as a spatio-temporal classification problem. In this paper, we present a novel end-to-end Residual Network (ResNet) and Temporal Convolutional Network (TCN) hybrid neural network architecture for students' engagement level detection in videos. The 2D ResNet extracts spatial features from consecutive video frames, and the TCN analyzes the temporal changes in video frames to detect the level of engagement. The spatial and temporal arms of the hybrid network are jointly trained on raw video frames of a large publicly available students' engagement detection dataset, DAiSEE. We compared our method with several competing students' engagement detection methods on this dataset. The ResNet+TCN architecture outperforms all other studied methods, improves the state-of-the-art engagement level detection accuracy, and sets a new baseline for future research.
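To make the temporal arm more concrete, here is a minimal sketch of a causal dilated 1D convolution, the basic operation a TCN stacks over a sequence. It is not the authors' implementation. The function names, the kernel values, and the choice of one convolution per dilation level are assumptions made for illustration, and the input is a single hypothetical feature channel (one number per frame) standing in for the per-frame ResNet features.

```python
from typing import List


def causal_dilated_conv1d(seq: List[float], kernel: List[float], dilation: int) -> List[float]:
    """Apply a causal dilated convolution to one feature channel over time.

    out[t] depends only on seq[t], seq[t - dilation], seq[t - 2*dilation], ...
    Positions before the start of the sequence are treated as zero padding.
    """
    k = len(kernel)
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for j in range(k):
            # The last kernel tap looks at the current frame; earlier taps
            # look further into the past, spaced by the dilation factor.
            idx = t - (k - 1 - j) * dilation
            if idx >= 0:
                acc += kernel[j] * seq[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: List[int]) -> int:
    """Number of past frames (including the current one) seen by the last output,
    assuming one convolution per dilation level (an illustrative simplification)."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical per-frame feature values for a 4-frame clip.
features = [1.0, 2.0, 3.0, 4.0]
level_1 = causal_dilated_conv1d(features, [1.0, 1.0], dilation=1)
level_2 = causal_dilated_conv1d(features, [1.0, 1.0], dilation=2)
```

Stacking such layers with growing dilation lets the last time step summarize a long span of frames. Under the one-convolution-per-level assumption, a kernel of size 3 with dilations 1, 2, 4, 8 covers 31 frames. In a hybrid network of the kind the abstract describes, the final temporal feature would then feed a classifier over engagement levels.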

Code Implementations: 1 repo