Self-supervised Human Activity Recognition by Learning to Predict Cross-Dimensional Motion
This work addresses activity recognition for mobile health and fitness applications, but its contribution is incremental, since it builds on existing self-supervised techniques.
The paper tackles human activity recognition from smartphone accelerometer data by proposing a self-supervised learning method that predicts cross-dimensional motion, and it reports state-of-the-art results on three public datasets.
We propose the use of self-supervised learning for human activity recognition with smartphone accelerometer data. Our solution consists of two steps. First, representations of unlabeled input signals are learned by training a deep convolutional neural network to predict a segment of accelerometer values. Our model uses a novel scheme that leverages past and present motion in the x and y dimensions, together with past values of the z axis, to predict values in the z dimension. This cross-dimensional prediction serves as an effective pretext task through which the model learns to extract strong representations. Second, we freeze the convolutional blocks and transfer their weights to a downstream network for human activity recognition. For this task, we append a number of fully connected layers to the frozen network and train only the added layers on labeled accelerometer signals to classify human activities. We evaluate our method on three publicly available human activity datasets: UCI HAR, MotionSense, and HAPT. The results show that our approach outperforms existing methods and sets a new state of the art on these datasets.
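To make the cross-dimensional pretext task concrete, the following is a minimal sketch of how input/target pairs might be built from an unlabeled tri-axial recording. It is not the authors' implementation: the function name `make_pretext_pairs`, the segment lengths, the stride, and the choice to zero-mask the z channel over the present segment are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def make_pretext_pairs(signal, past_len=64, present_len=32, stride=32):
    """Build (input, target) pairs for a cross-dimensional prediction task.

    signal: array of shape (T, 3) with columns x, y, z.
    Each input window holds past and present x and y, plus past z only;
    the present z segment is zeroed out (an assumed masking choice) and
    returned separately as the regression target.
    """
    signal = np.asarray(signal, dtype=np.float32)
    if signal.ndim != 2 or signal.shape[1] != 3:
        raise ValueError("signal must have shape (T, 3)")

    window = past_len + present_len
    inputs, targets = [], []
    for start in range(0, signal.shape[0] - window + 1, stride):
        segment = signal[start:start + window].copy()
        # Target: z values over the present segment.
        targets.append(segment[past_len:, 2].copy())
        # Hide present z from the model; x and y stay fully visible.
        segment[past_len:, 2] = 0.0
        inputs.append(segment)

    if not inputs:  # recording shorter than one window
        return (np.empty((0, window, 3), dtype=np.float32),
                np.empty((0, present_len), dtype=np.float32))
    return np.stack(inputs), np.stack(targets)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    recording = rng.normal(size=(1000, 3))  # synthetic stand-in for real data
    X, y = make_pretext_pairs(recording)
    print(X.shape, y.shape)
```

Under these assumptions, a convolutional encoder with a small regression head would be trained on such pairs; for the downstream step described above, the encoder weights would then be frozen and only newly added fully connected layers trained on labeled windows.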