Unsupervised Statistical Feature-Guided Diffusion Model for Sensor-based Human Activity Recognition
This work addresses the data scarcity problem for wearable sensor-based human activity recognition in application areas such as health and industry, but it is incremental, as it builds on diffusion models for a specific domain.
The paper tackles the scarcity of labeled sensor data for human activity recognition (HAR) by proposing an unsupervised statistical feature-guided diffusion model that generates synthetic labeled time-series data; in the reported experiments, this improved HAR performance and outperformed existing methods.
Human activity recognition (HAR) from on-body sensors is a core functionality in many AI applications, from personal health, through sports and wellness, to Industry 4.0. A key problem holding up progress in wearable sensor-based HAR, compared to other ML areas such as computer vision, is the unavailability of diverse and labeled training data. In particular, while there are innumerable annotated images available in online repositories, freely available sensor data is sparse and mostly unlabeled. We propose an unsupervised statistical feature-guided diffusion model specifically optimized for wearable sensor-based human activity recognition with devices such as inertial measurement unit (IMU) sensors. The method generates synthetic labeled time-series sensor data without relying on annotated training data. It thereby addresses the scarcity and annotation difficulties associated with real-world sensor data. By conditioning the diffusion model on statistical information such as mean, standard deviation, Z-score, and skewness, we generate diverse and representative synthetic sensor data. We conducted experiments on public human activity recognition datasets and compared the method to conventional oversampling and state-of-the-art generative adversarial network methods. Experimental results demonstrate that the proposed method can improve the performance of human activity recognition and outperform existing techniques.
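
To make the conditioning signal concrete, the statistical information named in the abstract (mean, standard deviation, Z-score, and skewness) could be computed per sensor channel over a window of samples, roughly as in the sketch below. This is only an illustration under stated assumptions: the function names `statistical_features` and `conditioning_vector`, the `(time, channels)` window layout, the `eps` stabilizer, and the way the features are concatenated are hypothetical and not taken from the paper, which may compute or combine these statistics differently.

```python
import numpy as np


def statistical_features(window, eps=1e-8):
    """Per-channel statistics of one sensor window.

    window: array of shape (T, C), T time steps and C sensor channels
            (assumed layout, e.g. C = 3 for a 3-axis accelerometer).
    eps:    small constant (an assumption) to avoid division by zero
            for channels with no variation.
    """
    window = np.asarray(window, dtype=np.float64)
    mean = window.mean(axis=0)             # shape (C,)
    std = window.std(axis=0)               # population std, shape (C,)
    z_score = (window - mean) / (std + eps)  # per-sample, shape (T, C)
    # Population skewness: mean of the cubed standardized values.
    skewness = (z_score ** 3).mean(axis=0)   # shape (C,)
    return {"mean": mean, "std": std, "z_score": z_score, "skewness": skewness}


def conditioning_vector(window):
    """Concatenate the per-channel summaries into one vector of length 3 * C.

    The per-sample Z-scores are left out here only to keep the vector
    fixed-size; this is a simplification for the sketch, not the paper's design.
    """
    feats = statistical_features(window)
    return np.concatenate([feats["mean"], feats["std"], feats["skewness"]])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    imu_window = rng.normal(size=(128, 6))  # synthetic stand-in for IMU data
    print(conditioning_vector(imu_window).shape)
```

Because these statistics are derived from the raw signal itself, they need no activity annotations, which is consistent with the unsupervised setting described above.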