CV · AI · RO · Dec 26, 2020

Skeleton-DML: Deep Metric Learning for Skeleton-Based One-Shot Action Recognition

arXiv:2012.13823v2 · 44 citations
AI Analysis

This work provides an incremental improvement in one-shot action recognition, which could benefit human-robot interaction by enabling robots to recognize novel human behaviors.

This paper addresses one-shot action recognition by formulating it as a deep metric learning problem and proposing a novel image-based skeleton representation. The approach improves on the state of the art by 3.3% on the one-shot protocol of the NTU RGB+D 120 dataset under a comparable training setup; with additional augmentation, the reported improvement exceeds 7.7%.

One-shot action recognition allows the recognition of human-performed actions with only a single training example. This can influence human-robot interaction positively by enabling the robot to react to previously unseen behaviour. We formulate the one-shot action recognition problem as a deep metric learning problem and propose a novel image-based skeleton representation that performs well in a metric learning setting. To this end, we train a model that projects the image representations into an embedding space. In the embedding space, similar actions have a low Euclidean distance, while dissimilar actions have a higher distance. The one-shot action recognition problem thus becomes a nearest-neighbor search in a set of activity reference samples. We evaluate the performance of our proposed representation against a variety of other skeleton-based image representations. In addition, we present an ablation study that shows the influence of different embedding vector sizes, losses and augmentation. Our approach lifts the state of the art by 3.3% for the one-shot action recognition protocol on the NTU RGB+D 120 dataset under a comparable training setup. With additional augmentation, our result improved by over 7.7%.
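The nearest-neighbor step described in the abstract can be illustrated with a minimal sketch. It assumes the embeddings have already been produced by some trained model; the function name `nearest_reference`, the action labels, and the 3-dimensional vectors are hypothetical and chosen only for illustration, not taken from the paper or its code.

```python
import math


def nearest_reference(query, references):
    """Return (label, distance) of the reference embedding closest to `query`.

    `references` maps an action label to the embedding of its single
    reference sample (one shot per class). Distances are Euclidean.
    """
    best_label, best_dist = None, math.inf
    for label, embedding in references.items():
        d = math.dist(query, embedding)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist


# Hypothetical embeddings: one reference sample per novel action class.
references = {
    "wave": [0.9, 0.1, 0.0],
    "sit_down": [0.0, 1.0, 0.2],
    "jump": [0.1, 0.0, 1.0],
}

# Hypothetical embedding of an unseen query sequence.
query = [0.8, 0.2, 0.1]

label, dist = nearest_reference(query, references)
print(label)
```

Because classification reduces to a distance lookup, adding a new action class in this sketch only requires adding one more entry to `references`; no retraining of the embedding model is implied.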

Code Implementations: 1 repo