A Prototype-Based Generalized Zero-Shot Learning Framework for Hand Gesture Recognition
This work addresses a domain-specific problem, hand gesture recognition for human-computer interaction, but its contribution is incremental because it builds on existing Generalized Zero-Shot Learning (GZSL) methods.
The paper proposes a prototype-based Generalized Zero-Shot Learning framework that recognizes hand gestures from both seen and unseen categories in human-computer interaction, and it reports experiments on a newly established dataset that demonstrate the framework's effectiveness.
Hand gesture recognition plays a significant role in human-computer interaction, where it is used to interpret human gestures and the intent behind them. However, most prior works can only recognize gestures from a limited set of labeled classes and fail to adapt to new categories. Generalized Zero-Shot Learning (GZSL) for hand gesture recognition addresses this limitation by leveraging semantic representations and detecting samples of both seen and unseen classes. In this paper, we propose an end-to-end prototype-based GZSL framework for hand gesture recognition that consists of two branches. The first branch is a prototype-based detector that learns gesture representations and determines whether an input sample belongs to a seen or an unseen category. The second branch is a zero-shot label predictor that takes the features of unseen-class samples as input and outputs predictions through a learned mapping between the feature space and the semantic space. We further establish a hand gesture dataset that specifically targets this GZSL task, and comprehensive experiments on this dataset demonstrate the effectiveness of our proposed approach in recognizing both seen and unseen gestures.
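
To make the two-branch idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how prototypes are built, how the seen/unseen decision is made, or what form the feature-to-semantic mapping takes, so everything here is an assumption chosen for illustration: prototypes are per-class feature means, the detector is a Euclidean distance threshold, the mapping is a fixed linear matrix `W`, and unseen labels are chosen by cosine similarity. All function names, class names, and values are hypothetical.

```python
import numpy as np


def class_prototypes(features, labels):
    """Assumption: a prototype is the mean feature vector of a seen class."""
    classes = np.unique(labels)
    protos = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, protos


def predict_gzsl(x, seen_classes, protos, threshold, W, unseen_names, unseen_semantics):
    """Route a feature vector x through a two-branch GZSL-style decision.

    Branch 1 (assumed rule): if x lies within `threshold` of its nearest
    seen-class prototype, treat it as a seen-class sample.
    Branch 2 (assumed rule): otherwise project x into the semantic space
    with the linear map W and pick the most similar unseen-class embedding.
    """
    dists = np.linalg.norm(protos - x, axis=1)
    nearest = int(np.argmin(dists))
    if dists[nearest] <= threshold:
        return "seen", seen_classes[nearest]

    z = W @ x  # feature space -> semantic space
    denom = np.linalg.norm(unseen_semantics, axis=1) * np.linalg.norm(z) + 1e-12
    sims = (unseen_semantics @ z) / denom  # cosine similarity per unseen class
    return "unseen", unseen_names[int(np.argmax(sims))]


# Toy data: two seen classes in a 2-D feature space (hypothetical values).
features = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]])
labels = np.array([0, 0, 1, 1])
seen_classes, protos = class_prototypes(features, labels)

# Hypothetical unseen classes with 2-D semantic embeddings; W is the identity
# here purely for illustration, whereas a real system would learn it.
W = np.eye(2)
unseen_names = ["swipe", "pinch"]
unseen_semantics = np.array([[1.0, 0.0], [0.0, 1.0]])

kind_a, label_a = predict_gzsl(np.array([0.1, 0.1]), seen_classes, protos,
                               1.0, W, unseen_names, unseen_semantics)
kind_b, label_b = predict_gzsl(np.array([10.0, 0.5]), seen_classes, protos,
                               1.0, W, unseen_names, unseen_semantics)
print(kind_a, label_a)  # seen 0
print(kind_b, label_b)  # unseen swipe
```

In this sketch the first query falls close to the prototype of class 0 and is handled by the detector alone, while the second is far from every prototype and is passed to the zero-shot predictor. In an end-to-end framework such as the one described, the representations, the decision boundary, and the mapping would presumably be learned jointly rather than fixed by hand as they are here.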