WEAR: An Outdoor Sports Dataset for Wearable and Egocentric Activity Recognition
This dataset addresses the scarcity of multimodal data for wearable and egocentric activity recognition in outdoor sports and gives researchers a usable resource, though the contribution is incremental, since it builds on existing trends in real-world applications.
The authors introduce WEAR, an outdoor sports dataset with synchronized egocentric video and inertial sensor data for human activity recognition. Benchmark results show that the two modalities offer complementary strengths and weaknesses in prediction performance, and that fusing them improves accuracy.
Research has shown the complementarity of camera- and inertial-based data for modeling human activities, yet datasets with both egocentric video and inertial-based sensor data remain scarce. In this paper, we introduce WEAR, an outdoor sports dataset for both vision- and inertial-based human activity recognition (HAR). Data were collected from 22 participants performing a total of 18 different workout activities, with synchronized inertial (acceleration) and camera (egocentric video) data recorded at 11 different outdoor locations. WEAR provides a challenging prediction scenario in changing outdoor environments, using a sensor placement in line with recent trends in real-world applications. Benchmark results show that, with our sensor placement, each modality offers complementary strengths and weaknesses in its prediction performance. Further, in light of the recent success of single-stage Temporal Action Localization (TAL) models, we demonstrate their versatility: they can be trained not only on visual data but also on raw inertial data, and they can fuse both modalities by means of simple concatenation. The dataset and code to reproduce experiments are publicly available via: mariusbock.github.io/wear/.
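To make "fusion by simple concatenation" concrete, the following is a minimal sketch, not the authors' implementation. It assumes each modality has already been turned into one feature vector per time window, and that the windows of both modalities are aligned. The function name `fuse_by_concatenation` and the feature sizes (16 visual, 12 inertial) are hypothetical and chosen only for illustration.

```python
import numpy as np


def fuse_by_concatenation(visual_feats, inertial_feats):
    """Join per-window feature vectors of two modalities along the feature axis.

    Both inputs are 2-D arrays of shape (num_windows, feature_dim) and must
    cover the same, time-aligned windows (an assumption of this sketch).
    """
    visual_feats = np.asarray(visual_feats)
    inertial_feats = np.asarray(inertial_feats)
    if visual_feats.ndim != 2 or inertial_feats.ndim != 2:
        raise ValueError("expected 2-D arrays of shape (num_windows, feature_dim)")
    if visual_feats.shape[0] != inertial_feats.shape[0]:
        raise ValueError("modalities must have the same number of windows")
    # Each fused row is [visual features | inertial features] for one window.
    return np.concatenate([visual_feats, inertial_feats], axis=1)


# Hypothetical sizes for illustration only: 8 windows, 16 visual and
# 12 inertial features per window.
num_windows = 8
visual = np.zeros((num_windows, 16))
inertial = np.ones((num_windows, 12))

fused = fuse_by_concatenation(visual, inertial)
print(fused.shape)
```

The fused array keeps one row per window, so a model that accepts a sequence of per-window vectors could consume it without architectural changes beyond a wider input dimension.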