CV · CL · Sep 3, 2024

The NGT200 Dataset: Geometric Multi-View Isolated Sign Recognition

arXiv:2409.15284v1 · 2 citations · h-index: 24
Originality: Synthesis-oriented
AI Analysis

This work addresses sign language processing for inclusivity, though it is incremental with a domain-specific focus.

The paper tackles multi-view isolated sign recognition (MV-ISR) by introducing the NGT200 dataset and applying an SE(2)-equivariant model, which improves MV-ISR performance by 8%-22% over the baseline.

Sign Language Processing (SLP) provides a foundation for a more inclusive future in language technology; however, the field faces several significant challenges that must be addressed to achieve practical, real-world applications. This work addresses multi-view isolated sign recognition (MV-ISR), and highlights the essential role of 3D awareness and geometry in SLP systems. We introduce the NGT200 dataset, a novel spatio-temporal multi-view benchmark, establishing MV-ISR as distinct from single-view ISR (SV-ISR). We demonstrate the benefits of synthetic data and propose conditioning sign representations on spatial symmetries inherent in sign language. Leveraging an SE(2) equivariant model improves MV-ISR performance by 8%-22% over the baseline.
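To make the term "SE(2) equivariant" concrete: SE(2) is the group of planar rotations and translations, and a function f is equivariant when transforming the input and then applying f gives the same result as applying f and then transforming the output. The sketch below is only an illustration of that property on a handful of made-up 2D keypoints; it is not the paper's model, and the names `se2_transform`, `centroid`, `pairwise_distances`, and `close` are hypothetical helpers introduced here.

```python
import math

def se2_transform(points, theta, tx, ty):
    """Rotate each (x, y) point by theta radians, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def centroid(points):
    """Mean position of the points: an SE(2)-equivariant feature."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def pairwise_distances(points):
    """All point-to-point distances: an SE(2)-invariant feature."""
    return [math.dist(p, q) for p in points for q in points]

def close(a, b, tol=1e-9):
    """Element-wise comparison of two equal-length float sequences."""
    return all(math.isclose(u, v, abs_tol=tol) for u, v in zip(a, b))

# Made-up keypoints (illustrative only, not taken from NGT200).
keypoints = [(0.0, 0.0), (1.0, 0.5), (-0.3, 2.0), (0.8, -1.2)]
theta, tx, ty = 0.7, 1.5, -2.0
moved = se2_transform(keypoints, theta, tx, ty)

# Equivariance: the centroid of the moved points equals the moved centroid.
centroid_equivariant = close(
    centroid(moved),
    se2_transform([centroid(keypoints)], theta, tx, ty)[0],
)

# Invariance: distances between keypoints do not change under SE(2).
distances_invariant = close(pairwise_distances(moved), pairwise_distances(keypoints))

print(centroid_equivariant, distances_invariant)
```

An equivariant network builds this kind of guarantee into every layer, so a sign seen from a rotated or shifted viewpoint yields correspondingly transformed features instead of unrelated ones.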

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
