CV, Jun 23, 2017

Multiresolution Match Kernels for Gesture Video Classification

arXiv:1706.07530v1, 4 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses gesture classification for applications such as sign language recognition. It appears incremental: it builds on the Bag-of-Features (BoF) method and does not claim a major breakthrough.

The authors address two weaknesses of Bag-of-Features in gesture video classification: poor similarity measures between videos and the loss of spatio-temporal information. They propose a Multiresolution Match Kernel (MMK) as a generalization of BoF. They report that results on ASL hand gesture classification from RGB-D videos show the method is promising and useful, though the abstract provides no concrete numbers.

The emergence of depth imaging technologies like the Microsoft Kinect has renewed interest in computational methods for gesture classification based on videos. For several years now, researchers have used Bag-of-Features (BoF) as a primary method for generating feature vectors from video data for gesture recognition. However, the BoF method is a coarse representation of the information in a video, which often leads to poor similarity measures between videos. Moreover, when features extracted from different spatio-temporal locations in the video are pooled to create histogram vectors in the BoF method, their original locations in space and time are lost. In this paper, we propose a new Multiresolution Match Kernel (MMK) for video classification, which can be considered a generalization of the BoF method. We apply this procedure to hand gesture classification based on RGB-D videos of American Sign Language (ASL) hand gestures, and our results show the promise and usefulness of this new method.
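The location loss described in the abstract can be illustrated with a minimal sketch of generic BoF pooling. This is not the authors' MMK or their pipeline; the codebook, the descriptors, and the helper name bof_histogram are hypothetical toy values chosen for illustration. Each local descriptor is assigned to its nearest codeword and the assignments are counted, so the position of a descriptor in space and time never enters the representation.

```python
import numpy as np


def bof_histogram(descriptors, codebook):
    """Hypothetical helper: nearest-codeword quantization plus counting."""
    # Squared Euclidean distances, shape (n_descriptors, n_codewords).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of the nearest codeword per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()  # normalize so the histogram sums to 1


# Toy codebook with three 2-D codewords (assumed values, not from the paper).
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

# Video A: four local descriptors, listed in temporal order.
video_a = np.array([[0.1, 0.0], [0.9, 0.1], [0.0, 1.1], [1.0, -0.1]])
# Video B: the same descriptors occurring in reversed temporal order.
video_b = video_a[::-1]

hist_a = bof_histogram(video_a, codebook)
hist_b = bof_histogram(video_b, codebook)

print(hist_a)                          # [0.25 0.5  0.25]
print(np.array_equal(hist_a, hist_b))  # True
```

The two toy videos differ only in when each descriptor occurs, yet their histograms are identical, so any similarity measure computed on these vectors cannot tell them apart.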
