CV · Aug 3, 2017

Extreme Low Resolution Activity Recognition with Multi-Siamese Embedding Learning

arXiv:1708.00999v2 · 79 citations
Originality: Incremental advance
AI Analysis

This work addresses activity recognition for privacy-preserving or distant scenarios, but it is incremental as it builds on existing low-resolution recognition approaches.

The paper tackles human activity recognition from extreme low resolution videos (e.g., 16x12). It designs a two-stream multi-Siamese convolutional neural network that learns transform-robust representations, and it outperforms previous state-of-the-art methods on two public datasets by a meaningful margin.

This paper presents an approach for recognizing human activities from extreme low resolution (e.g., 16x12) videos. Extreme low resolution recognition is not only necessary for analyzing actions at a distance but is also crucial for enabling privacy-preserving recognition of human activities. We design a new two-stream multi-Siamese convolutional neural network. The idea is to explicitly capture an inherent property of low resolution (LR) videos: two images originating from the exact same scene often have totally different pixel values depending on their LR transformations. Our approach learns a shared embedding space that maps LR videos with the same content to the same location regardless of their transformations. We experimentally confirm that jointly learning this transform-robust LR video representation and the classifier outperforms the previous state-of-the-art low resolution recognition approaches on two public standard datasets by a meaningful margin.
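The property described in the abstract can be illustrated with a small NumPy sketch. It is not the paper's implementation: the shift-then-average-pool transform, the helper names `downsample` and `contrastive_loss`, the 160x120 source frame, and the use of a standard contrastive loss are all assumptions chosen for illustration, since this summary does not specify the actual LR transformations or training objective.

```python
import numpy as np

def downsample(hr, factor, dx=0, dy=0):
    """Hypothetical LR transform: shift the HR frame, then average-pool.

    np.roll wraps pixels around the border, which is a simplification.
    """
    shifted = np.roll(hr, shift=(dy, dx), axis=(0, 1))
    h, w = shifted.shape
    h, w = h - h % factor, w - w % factor  # crop to a multiple of factor
    blocks = shifted[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def contrastive_loss(e1, e2, same, margin=1.0):
    """Standard contrastive loss on one embedding pair (assumed objective).

    Pairs with the same content are pulled together; pairs with different
    content are pushed at least `margin` apart.
    """
    d = np.linalg.norm(e1 - e2)
    return d ** 2 if same else max(0.0, margin - d) ** 2

rng = np.random.default_rng(0)
hr = rng.random((120, 160))             # stand-in for a 160x120 HR frame

lr_a = downsample(hr, 10)               # 16x12 LR image, no shift
lr_b = downsample(hr, 10, dx=3, dy=2)   # same scene, different transform

# Same content, yet the LR pixel values disagree.
pixel_gap = np.abs(lr_a - lr_b).mean()

# With raw pixels used as a naive "embedding", the same-content pair is
# penalised; a learned embedding would aim to drive this toward zero.
naive_loss = contrastive_loss(lr_a.ravel(), lr_b.ravel(), same=True)
```

In a Siamese setup, the raw pixels passed to `contrastive_loss` here would be replaced by the outputs of a shared network applied to each LR input, so that the loss shapes the embedding rather than the pixels.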

