CL, AI, LG · Jul 11, 2023

ISLTranslate: Dataset for Translating Indian Sign Language

arXiv:2307.05440v1223 citations · h-index: 24
Originality: Synthesis-oriented
AI Analysis

This paper addresses the communication gap faced by the hard-of-hearing community in India by providing a dataset for developing translation systems. The contribution is incremental, since it extends existing sign language dataset efforts to a new language.

The paper tackles the lack of resources for Indian Sign Language (ISL) by introducing ISLTranslate, a dataset of 31k ISL-English sentence/phrase pairs. The authors describe it as, to the best of their knowledge, the largest translation dataset for continuous ISL, and they benchmark it with a transformer-based translation model.

Sign languages are the primary means of communication for many hard-of-hearing people worldwide. Recently, to bridge the communication gap between the hard-of-hearing community and the rest of the population, several sign language translation datasets have been proposed to enable the development of statistical sign language translation systems. However, there is a dearth of sign language resources for Indian Sign Language. This resource paper introduces ISLTranslate, a translation dataset for continuous Indian Sign Language (ISL) consisting of 31k ISL-English sentence/phrase pairs. To the best of our knowledge, it is the largest translation dataset for continuous Indian Sign Language. We provide a detailed analysis of the dataset. To validate the performance of existing end-to-end sign-language-to-spoken-language translation systems, we benchmark the created dataset with a transformer-based model for ISL translation.

Code Implementations: 1 repo