Event-based Action Recognition Using Timestamp Image Encoding Network
This work addresses action recognition for applications like robotics or surveillance using event cameras, offering an incremental improvement by adapting standard computer vision tools to this sensor type.
The paper tackles action recognition using event camera data by proposing a timestamp image encoding 2D network to process spatial-temporal information, achieving performance comparable to RGB-based benchmarks on real-world action recognition and state-of-the-art results on gesture recognition.
An event camera is an asynchronous, high-frequency vision sensor with low power consumption, which makes it well suited to human action recognition. It is vital to encode the spatial-temporal information of event data properly so that standard computer vision tools can learn from it. In this work, we propose a timestamp image encoding 2D network, which takes the encoded spatial-temporal images of the event data as input and outputs the action label. Experimental results show that our method achieves the same level of performance as RGB-based benchmarks on real-world action recognition, and also achieves the state-of-the-art (SOTA) result on gesture recognition.
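To make the idea of a timestamp image concrete, the following is a minimal sketch of one common way to encode an event stream into a 2D image, not necessarily the exact encoding used in this work. It assumes events arrive as rows of (x, y, t, polarity), keeps the most recent timestamp per pixel and polarity, and normalizes timestamps to [0, 1] over the window. The function name `timestamp_image` and the two-channel layout are hypothetical choices made for illustration.

```python
import numpy as np

def timestamp_image(events, height, width):
    """Encode events as a (2, H, W) image of normalized latest timestamps.

    Assumed input layout (illustrative, not taken from the paper):
    each row of `events` is (x, y, t, polarity), with polarity > 0
    mapped to channel 1 and all other values mapped to channel 0.
    """
    img = np.zeros((2, height, width), dtype=np.float32)
    events = np.asarray(events, dtype=np.float64).reshape(-1, 4)
    if events.shape[0] == 0:
        return img  # no events in this window

    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    t = events[:, 2]
    p = (events[:, 3] > 0).astype(int)

    # Normalize timestamps to [0, 1] over the window.
    t0, t1 = t.min(), t.max()
    span = t1 - t0
    norm = (t - t0) / span if span > 0 else np.ones_like(t)

    # Keep the largest (most recent) normalized timestamp per pixel/channel.
    # Note: the earliest event maps to 0, the same value as an empty pixel.
    np.maximum.at(img, (p, y, x), norm.astype(np.float32))
    return img

if __name__ == "__main__":
    demo = [[1, 2, 0.0, 1], [1, 2, 0.5, 1], [3, 0, 1.0, 0]]
    print(timestamp_image(demo, height=4, width=5))
```

The resulting array can be fed to a standard 2D convolutional network in the same way as an ordinary image, which is the property the approach relies on.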