Unleashing the Power of Shared Label Structures for Human Activity Recognition
This work aims to improve human activity recognition (HAR) for applications such as healthcare and smart devices by leveraging label semantics; the contribution is incremental in that it builds on existing HAR methods.
The paper approaches human activity recognition by modeling shared semantic structures in label names, such as common actions or objects, to improve performance across activities, especially those with limited samples. The proposed SHARE framework outperforms state-of-the-art models on seven benchmark datasets, with an even larger margin in few-shot and label-imbalanced settings.
Current human activity recognition (HAR) techniques regard activity labels as integer class IDs without explicitly modeling the semantics of class labels. We observe that different activity names often have shared structures. For example, "open door" and "open fridge" both have "open" as the action; "kicking soccer ball" and "playing tennis ball" both have "ball" as the object. Such shared structures in label names can translate into similarity in the sensory data, and modeling these common structures helps uncover knowledge shared across different activities, especially for activities with limited samples. In this paper, we propose SHARE, a HAR framework that takes into account the shared structures of label names for different activities. To exploit the shared structures, SHARE comprises an encoder for extracting features from input sensory time series and a decoder for generating label names as a token sequence. We also propose three label augmentation techniques to help the model capture semantic structures across activities more effectively: a basic token-level augmentation, and enhanced embedding-level and sequence-level augmentations that utilize the capabilities of pre-trained models. SHARE outperforms state-of-the-art HAR models in extensive experiments on seven HAR benchmark datasets. We also evaluate SHARE in few-shot learning and label-imbalance settings and observe an even larger performance gap.
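To make the idea of shared label structures concrete, the following minimal Python sketch turns the label names from the example above into token sequences over a common vocabulary and lists the tokens that recur across activities. It is an illustration under stated assumptions, not the SHARE implementation: the whitespace tokenizer, the `<bos>`/`<eos>` markers, and the helper names `tokenize`, `build_vocab`, `encode`, and `shared_tokens` are all hypothetical choices made here.

```python
from collections import defaultdict

# Hypothetical sequence markers; the abstract does not specify any special tokens.
BOS, EOS = "<bos>", "<eos>"


def tokenize(label):
    """Split a label name into lowercase word tokens (assumed whitespace tokenizer)."""
    return label.lower().split()


def build_vocab(labels):
    """Assign one integer ID per distinct token, shared across all label names."""
    vocab = {BOS: 0, EOS: 1}
    for label in labels:
        for tok in tokenize(label):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(label, vocab):
    """Represent a label as a token-ID sequence, the kind of target a decoder could generate."""
    return [vocab[BOS]] + [vocab[tok] for tok in tokenize(label)] + [vocab[EOS]]


def shared_tokens(labels):
    """Map each token that occurs in more than one label to the labels containing it."""
    owners = defaultdict(list)
    for label in labels:
        for tok in set(tokenize(label)):
            owners[tok].append(label)
    return {tok: names for tok, names in owners.items() if len(names) > 1}


labels = ["open door", "open fridge", "kicking soccer ball", "playing tennis ball"]
vocab = build_vocab(labels)

for label in labels:
    print(label, "->", encode(label, vocab))
print(shared_tokens(labels))
```

With integer class IDs, the four activities would be four unrelated classes. In the token-sequence view, "open door" and "open fridge" share the ID for "open", and the two ball activities share the ID for "ball", so a sequence decoder can reuse what it learns about a token across activities.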