CV · Jun 25, 2024

Dark Transformer: A Video Transformer for Action Recognition in the Dark

arXiv:2407.12805v2 · 1 citation

Originality: Incremental advance

AI Analysis

The paper addresses the problem of recognizing human actions in adverse lighting, with applications such as surveillance and nighttime driving. It is an incremental advance: it extends existing video transformers to low-light domains.

The paper tackles action recognition in low-light conditions by introducing Dark Transformer, a video transformer that learns cross-domain spatiotemporal representations, achieving state-of-the-art performance on datasets like InFAR, XD145, and ARID.

Recognizing human actions in adverse lighting conditions presents significant challenges in computer vision, with wide-ranging applications in visual surveillance and nighttime driving. Existing methods tackle action recognition and dark enhancement separately, which limits end-to-end learning of spatiotemporal representations for video action classification. This paper introduces Dark Transformer, a novel video transformer-based approach for action recognition in low-light environments. Dark Transformer applies spatiotemporal self-attention in a cross-domain setting to improve action recognition. By extending video transformers to learn cross-domain knowledge, Dark Transformer achieves state-of-the-art performance on benchmark action recognition datasets, including InFAR, XD145, and ARID. The proposed approach shows promise for action recognition in adverse lighting conditions, with practical implications for real-world applications.
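As a rough illustration of what spatiotemporal self-attention over a video means, the sketch below flattens the patch tokens of all frames into one sequence and lets every token attend to every other token across both space and time. It is not the paper's architecture: the function name `spatiotemporal_self_attention`, the toy input, and the use of identity query/key/value projections with a single head are all assumptions made for brevity, and the cross-domain component described in the abstract is not modeled.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def spatiotemporal_self_attention(video_tokens):
    """Joint space-time attention (hypothetical helper, not from the paper).

    video_tokens: list of T frames, each a list of P patch vectors of size D.
    Returns T*P attended vectors of size D.
    Assumption: single head, identity Q/K/V projections (no learned weights).
    """
    # Flatten (T, P, D) into a single sequence of T*P tokens.
    tokens = [patch for frame in video_tokens for patch in frame]
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)

    attended = []
    for query in tokens:
        # Scaled dot-product scores against every token in every frame.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        # Weighted sum of value vectors (here the tokens themselves).
        attended.append(
            [sum(w * value[d] for w, value in zip(weights, tokens)) for d in range(dim)]
        )
    return attended


# Toy clip: 2 frames, 2 patches per frame, 3-dimensional patch embeddings.
video = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]],
]
attended = spatiotemporal_self_attention(video)
```

Each output vector is a convex combination of all patch embeddings in the clip, so information from one frame can influence the representation of a patch in another. Practical video transformers add learned projections, multiple heads, and positional encodings on top of this core operation.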
