CV · IV · May 29, 2020

IMUTube: Automatic Extraction of Virtual on-body Accelerometry from Video for Human Activity Recognition

arXiv:2006.05675v2 · 106 citations
Originality: Incremental advance
AI Analysis

The paper addresses data scarcity for researchers in human activity recognition, though its contribution appears incremental because it integrates existing techniques rather than introducing new ones.

The paper tackles the lack of labeled data for on-body sensor-based human activity recognition by introducing IMUTube, an automated pipeline that converts videos into virtual IMU accelerometry data, which improves model performance on known datasets.

The lack of large-scale, labeled data sets impedes progress in developing robust and generalized predictive models for on-body sensor-based human activity recognition (HAR). Labeled data in human activity recognition is scarce and hard to come by, as sensor data collection is expensive, and the annotation is time-consuming and error-prone. To address this problem, we introduce IMUTube, an automated processing pipeline that integrates existing computer vision and signal processing techniques to convert videos of human activity into virtual streams of IMU data. These virtual IMU streams represent accelerometry at a wide variety of locations on the human body. We show how the virtually-generated IMU data improves the performance of a variety of models on known HAR datasets. Our initial results are very promising, but the greater promise of this work lies in a collective approach by the computer vision, signal processing, and activity recognition communities to extend this work in ways that we outline. This should lead to on-body, sensor-based HAR becoming yet another success story in large-dataset breakthroughs in recognition.
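To make the idea of "virtual accelerometry" concrete, here is a minimal sketch of one plausible step in such a pipeline. It is not the authors' implementation. It assumes that 3D positions of a body location (here a hypothetical wrist track, in metres) have already been recovered from video at a fixed frame rate, and it estimates linear acceleration by differentiating the trajectory twice with finite differences. The function name `virtual_acceleration`, the synthetic trajectory, and the 30 fps frame rate are illustrative assumptions. Gravity, sensor orientation, and camera motion are ignored.

```python
import numpy as np


def virtual_acceleration(positions, fps):
    """Estimate linear acceleration (m/s^2) from a (T, 3) position track (m).

    Hypothetical helper: differentiates the trajectory twice using
    finite differences at the video frame rate.
    """
    positions = np.asarray(positions, dtype=float)
    dt = 1.0 / fps
    # First derivative: velocity in m/s.
    velocity = np.gradient(positions, dt, axis=0)
    # Second derivative: acceleration in m/s^2.
    return np.gradient(velocity, dt, axis=0)


# Synthetic wrist track: constant 2 m/s^2 acceleration along x, at rest in y and z.
fps = 30.0
t = np.arange(0.0, 2.0, 1.0 / fps)
wrist = np.stack([t**2, np.zeros_like(t), np.zeros_like(t)], axis=1)

accel = virtual_acceleration(wrist, fps)
```

The first and last two samples use one-sided differences and are less accurate, so a real pipeline would likely trim or smooth the edges. Pose estimates from video are also noisy, and double differentiation amplifies that noise, which is one reason filtering would normally precede a step like this.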

