CV · Jul 24, 2019

Zero-Shot Sign Language Recognition: Can Textual Data Uncover Sign Languages?

arXiv:1907.10292v1 · 31 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of recognizing unseen sign language classes for accessibility applications, presenting a novel dataset and framework, though it is incremental in adapting existing zero-shot learning methods to a new domain.

The paper tackles zero-shot sign language recognition by using textual descriptions from sign language dictionaries to transfer knowledge from seen to unseen signs, and its results indicate that textual data helps in recognizing signs that have no training examples.

We introduce the problem of zero-shot sign language recognition (ZSSLR), where the goal is to leverage models learned over the seen sign class examples to recognize the instances of unseen signs. To this end, we propose to utilize the readily available descriptions in sign language dictionaries as an intermediate-level semantic representation for knowledge transfer. We introduce a new benchmark dataset called ASL-Text that consists of 250 sign language classes and their accompanying textual descriptions. Compared to the ZSL datasets in other domains (such as object recognition), our dataset consists of a limited number of training examples for a large number of classes, which imposes a significant challenge. We propose a framework that operates over the body and hand regions by means of 3D-CNNs, and models longer temporal relationships via bidirectional LSTMs. By leveraging the descriptive text embeddings along with these spatio-temporal representations within a zero-shot learning framework, we show that textual data can indeed be useful in uncovering sign languages. We anticipate that the introduced approach and the accompanying dataset will provide a basis for further exploration of this new zero-shot learning problem.
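
The knowledge-transfer idea in the abstract can be illustrated with a minimal sketch. It assumes a bilinear compatibility score between a video feature and a class text embedding, which is one common zero-shot formulation; the abstract does not state which zero-shot model the paper uses. The names `compatibility`, `predict_unseen`, `W`, and all toy values are hypothetical. In the paper's setting, the video feature would come from the 3D-CNN and bidirectional LSTM pipeline and the text embedding from a dictionary description; here both are hand-written vectors.

```python
from typing import Dict, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def compatibility(video_feat: Vector, W: Matrix, text_emb: Vector) -> float:
    """Bilinear score x^T W t between a video feature x and a text embedding t."""
    # Project the video feature into the text-embedding space: p = x^T W.
    projected = [
        sum(video_feat[i] * W[i][j] for i in range(len(video_feat)))
        for j in range(len(text_emb))
    ]
    # Dot product of the projected feature with the class text embedding.
    return sum(p * t for p, t in zip(projected, text_emb))


def predict_unseen(video_feat: Vector, W: Matrix,
                   unseen_text_embs: Dict[str, Vector]) -> str:
    """Return the unseen class whose text embedding scores highest."""
    return max(
        unseen_text_embs,
        key=lambda name: compatibility(video_feat, W, unseen_text_embs[name]),
    )


# Toy data (assumption): W would normally be learned on seen classes only.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [0.5, 0.5]]
unseen = {"sign_a": [1.0, 0.0], "sign_b": [0.0, 1.0]}
video = [0.2, 0.9, 0.1]

prediction = predict_unseen(video, W, unseen)
```

Because `W` is fit only on seen classes while prediction ranges over the text embeddings of unseen classes, no video examples of the unseen signs are needed at test time, only their descriptions.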

