CV
Mar 16, 2025

Domain Generalization for Improved Human Activity Recognition in Office Space Videos Using Adaptive Pre-processing

arXiv:2503.12678v1
h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses the challenge of recognizing human activities in office environments with diverse video data sources, offering an incremental improvement for surveillance, healthcare, and robotics applications.

The paper tackles domain generalization for human activity recognition in office videos by proposing three adaptive pre-processing techniques that can be applied to any video encoder. Combined with MViT and other video encoders, these techniques are reported to improve accuracy, precision, recall, and F1 score on unseen domains over state-of-the-art domain adaptation methods.

Automatic video activity recognition is crucial across numerous domains like surveillance, healthcare, and robotics. However, recognizing human activities from video data becomes challenging when training and test data stem from diverse domains. Domain generalization, adapting to unforeseen domains, is thus essential. This paper focuses on office activity recognition amidst environmental variability. We propose three pre-processing techniques applicable to any video encoder, enhancing robustness against environmental variations. Our study showcases the efficacy of MViT, a leading state-of-the-art video classification model, and other video encoders combined with our techniques, outperforming state-of-the-art domain adaptation methods. Our approach significantly boosts accuracy, precision, recall and F1 score on unseen domains, emphasizing its adaptability in real-world scenarios with diverse video data sources. This method lays a foundation for more reliable video activity recognition systems across heterogeneous data domains.

