LG, CL, CV
Aug 21, 2024

Limitations in Employing Natural Language Supervision for Sensor-Based Human Activity Recognition -- And Ways to Overcome Them

arXiv:2408.12023v111 citations
h-index: 24
Originality: Incremental advance
AI Analysis

This work addresses the problem of improving human activity recognition using wearables for applications like health monitoring, though it is incremental as it builds on existing cross-modal contrastive pre-training methods.

The paper investigates natural language supervision for wearable sensor-based human activity recognition and finds that it performs worse than standard end-to-end training and self-supervision, identifying sensor heterogeneity and a lack of rich, diverse text descriptions as the key causes. The authors develop strategies that significantly improve recognition, bringing performance closer to supervised and self-supervised training while also enabling unseen activity recognition and cross-modal retrieval.

Cross-modal contrastive pre-training between natural language and other modalities, e.g., vision and audio, has demonstrated astonishing performance and effectiveness across a diverse variety of tasks and domains. In this paper, we investigate whether such natural language supervision can be used for wearable sensor-based Human Activity Recognition (HAR), and discover that, surprisingly, it performs substantially worse than standard end-to-end training and self-supervision. We identify the primary causes of this as sensor heterogeneity and the lack of rich, diverse text descriptions of activities. To mitigate their impact, we develop strategies and assess their effectiveness through an extensive experimental evaluation. These strategies lead to significant increases in activity recognition, bringing performance closer to supervised and self-supervised training, while also enabling the recognition of unseen activities and cross-modal retrieval of videos. Overall, our work paves the way for better sensor-language learning, ultimately leading to the development of foundational models for HAR using wearables.
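
For readers unfamiliar with cross-modal contrastive pre-training, the following is a minimal sketch of the generic symmetric contrastive (InfoNCE-style) objective that such methods typically use, applied here to paired sensor and text embeddings. It is not the paper's implementation: the function names, the temperature value of 0.07, and the use of plain NumPy arrays in place of trained encoders are all assumptions made for illustration. The last function shows how a shared embedding space could, in principle, support recognition of unseen activities by matching a sensor embedding to the nearest activity description.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def contrastive_loss(sensor_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Row i of sensor_emb and row i of text_emb are assumed to describe the
    same activity window (hypothetical pairing; temperature is an assumed value).
    """
    s = l2_normalize(sensor_emb)
    t = l2_normalize(text_emb)
    logits = s @ t.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])        # matching pairs lie on the diagonal
    loss_s2t = -log_softmax(logits, axis=1)[idx, idx].mean()  # sensor -> text
    loss_t2s = -log_softmax(logits, axis=0)[idx, idx].mean()  # text -> sensor
    return 0.5 * (loss_s2t + loss_t2s)


def zero_shot_predict(sensor_emb, class_text_emb):
    """Assign each sensor embedding to the most similar class description."""
    sims = l2_normalize(sensor_emb) @ l2_normalize(class_text_emb).T
    return sims.argmax(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    text = rng.normal(size=(4, 8))                    # stand-in text embeddings
    sensor = text + 0.01 * rng.normal(size=(4, 8))    # stand-in sensor embeddings
    print("aligned loss:", contrastive_loss(sensor, text))
    print("shuffled loss:", contrastive_loss(sensor, text[::-1]))
    print("zero-shot labels:", zero_shot_predict(sensor, text))
```

In this toy setup the loss is lower when sensor and text rows are correctly paired than when the text rows are reversed, which is the signal a contrastive objective uses to pull matching pairs together in the shared space.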

