CV · CL · Apr 3, 2021

Fingerspelling Detection in American Sign Language

arXiv:2104.01291v1 · 27 citations
Originality: Incremental advance
AI Analysis

This work addresses a key bottleneck for real-world fingerspelling recognition systems, though the advance is incremental: it builds on prior work by adding detection to existing tasks.

The paper tackles the problem of detecting fingerspelling regions in untrimmed American Sign Language videos, proposing a new model that outperforms alternatives across all metrics and establishes state-of-the-art results on a new benchmark.

Fingerspelling, in which words are signed letter by letter, is an important component of American Sign Language. Most previous work on automatic fingerspelling recognition has assumed that the boundaries of fingerspelling regions in signing videos are known beforehand. In this paper, we consider the task of fingerspelling detection in raw, untrimmed sign language videos. This is an important step towards building real-world fingerspelling recognition systems. We propose a benchmark and a suite of evaluation metrics, some of which reflect the effect of detection on the downstream fingerspelling recognition task. In addition, we propose a new model that learns to detect fingerspelling via multi-task training, incorporating pose estimation and fingerspelling recognition (transcription) along with detection, and compare this model to several alternatives. The model outperforms all alternative approaches across all metrics, establishing a state of the art on the benchmark.

Code Implementations: 1 repo
