Privacy-Preserving Human Activity Recognition from Extreme Low Resolution
This work addresses privacy concerns in computer vision systems such as robots by enabling activity recognition without invasive video recording. Its contribution is incremental in that it builds on existing super-resolution concepts.
The paper tackles the problem of recognizing human activities from extreme low-resolution videos, balancing recognition against privacy protection. It introduces inverse super resolution (ISR), which generates multiple low-resolution training videos from each high-resolution video, and confirms its benefit experimentally.
Privacy protection from surreptitious video recordings is an important societal challenge. We desire a computer vision system (e.g., a robot) that can recognize human activities and assist our daily life, yet ensure that it is not recording video that may invade our privacy. This paper presents a fundamental approach to address such contradicting objectives: human activity recognition while only using extreme low-resolution (e.g., 16x12) anonymized videos. We introduce the paradigm of inverse super resolution (ISR), the concept of learning the optimal set of image transformations to generate multiple low-resolution (LR) training videos from a single video. Our ISR learns different types of sub-pixel transformations optimized for the activity classification, allowing the classifier to best take advantage of existing high-resolution videos (e.g., YouTube videos) by creating multiple LR training videos tailored for the problem. We experimentally confirm that the paradigm of inverse super resolution is able to benefit activity recognition from extreme low-resolution videos.
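The core idea of generating several LR training videos from one high-resolution video can be illustrated with a minimal sketch. It rests on assumptions that are not from the paper: the transformations here are a fixed, hand-picked list of translations rather than a learned, classification-optimized set, downsampling is plain average pooling, and the names `downsample`, `shift_hr`, and `make_lr_videos` are hypothetical. A translation by a few high-resolution pixels corresponds to a sub-pixel shift at low resolution, so each shift yields a slightly different LR video of the same activity.

```python
import numpy as np


def downsample(frames, factor):
    """Average-pool a (T, H, W) video by an integer factor along H and W."""
    t, h, w = frames.shape
    if h % factor or w % factor:
        raise ValueError("frame height and width must be divisible by factor")
    blocks = frames.reshape(t, h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(2, 4))


def shift_hr(frames, dy, dx):
    """Translate a (T, H, W) video by (dy, dx) high-resolution pixels.

    Positive dy moves content down, positive dx moves it right.
    Uncovered borders are filled by replicating the edge pixels.
    """
    t, h, w = frames.shape
    pad_y, pad_x = abs(dy), abs(dx)
    padded = np.pad(frames, ((0, 0), (pad_y, pad_y), (pad_x, pad_x)), mode="edge")
    y0, x0 = pad_y - dy, pad_x - dx
    return padded[:, y0:y0 + h, x0:x0 + w]


def make_lr_videos(frames, factor, shifts):
    """Return one LR video per shift, stacked as (num_shifts, T, H/factor, W/factor)."""
    return np.stack([downsample(shift_hr(frames, dy, dx), factor) for dy, dx in shifts])


# Assumed example: an 8-frame 160x120 grayscale clip reduced to 16x12.
rng = np.random.default_rng(0)
hr_video = rng.random((8, 120, 160))            # (T, H, W)
shifts = [(0, 0), (3, 0), (0, 5), (-4, -2)]     # hand-picked, NOT learned
lr_videos = make_lr_videos(hr_video, factor=10, shifts=shifts)
print(lr_videos.shape)  # (4, 8, 12, 16)
```

With a downsampling factor of 10, the shift `(3, 0)` amounts to 0.3 of an LR pixel. In the paper's setting the transformation set would be learned for the classifier instead of listed by hand, and could include transformation types other than translation.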