Trong-Sinh Vu

3.6 · CV · Jul 30, 2025 · Code

Exploring the Application of Visual Question Answering (VQA) for Classroom Activity Monitoring

Sinh Trong Vu, Hieu Trung Pham, Dung Manh Nguyen et al.

Classroom behavior monitoring is a critical aspect of educational research, with significant implications for student engagement and learning outcomes. Recent advances in Visual Question Answering (VQA) models offer promising tools for automatically analyzing complex classroom interactions from video recordings. In this paper, we investigate the applicability of several state-of-the-art open-source VQA models, including LLaMA2, LLaMA3, QWEN3, and NVILA, to classroom behavior analysis. To facilitate rigorous evaluation, we introduce the BAV-Classroom-VQA dataset, derived from real-world classroom video recordings at the Banking Academy of Vietnam. We present the methodology for data collection and annotation, and benchmark the performance of the selected VQA models on this dataset. Our initial experimental results show that all four models achieve promising performance in answering behavior-related visual questions, indicating their potential for future classroom analytics and intervention systems.

1.6 · CL · Nov 16, 2017

ConvAMR: Abstract meaning representation parsing for legal document

Lai Dac Viet, Vu Trong Sinh, Nguyen Le Minh et al.

Convolutional neural networks (CNNs) have recently achieved remarkable performance in a wide range of applications. In this research, we equip a convolutional sequence-to-sequence (seq2seq) model with an efficient graph linearization technique for abstract meaning representation (AMR) parsing. Our linearization method is better than the prior method at signaling turns in the graph traversal. Additionally, the convolutional seq2seq model is more appropriate for this task and considerably faster than recurrent neural network models. Our method outperforms previous methods by a large margin on the standard dataset LDC2014T12. Our results indicate that there is still room for future work to improve parsing models that use the graph linearization approach.
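To make the idea of graph linearization concrete, here is a minimal Python sketch of a generic depth-first linearization in which parentheses mark where the traversal descends into a subgraph and where it returns. This is not the paper's method: the abstract does not specify its linearization scheme, and the toy graph format, the `linearize` function, and the way re-entrant nodes are handled are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy format: node id -> (concept, [(relation, child id), ...])
Graph = Dict[str, Tuple[str, List[Tuple[str, str]]]]


def linearize(graph: Graph, root: str) -> List[str]:
    """Depth-first linearization of a rooted graph into a token sequence.

    "(" is emitted when the traversal descends into a node and ")" when it
    returns, so a sequence model can see the turns of the traversal.
    """
    tokens: List[str] = []
    visited: Set[str] = set()

    def visit(node: str) -> None:
        concept, edges = graph[node]
        if node in visited:
            # Re-entrant node: repeat the concept without expanding it again
            # (an illustrative simplification).
            tokens.append(concept)
            return
        visited.add(node)
        tokens.append("(")
        tokens.append(concept)
        for relation, child in edges:
            tokens.append(relation)
            visit(child)
        tokens.append(")")

    visit(root)
    return tokens


# "The boy wants to go": want-01 with ARG0 boy and ARG1 go-01,
# where go-01's ARG0 is the same boy node.
graph: Graph = {
    "w": ("want-01", [(":ARG0", "b"), (":ARG1", "g")]),
    "b": ("boy", []),
    "g": ("go-01", [(":ARG0", "b")]),
}

print(" ".join(linearize(graph, "w")))
# ( want-01 :ARG0 ( boy ) :ARG1 ( go-01 :ARG0 boy ) )
```

A seq2seq parser trained on such sequences predicts the token string from the input sentence, and the graph is then rebuilt by matching the parentheses.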
