CV, Aug 18, 2025

Real-Time Sign Language Gestures to Speech Transcription using Deep Learning

arXiv:2508.12713v1, 3 citations
Originality: Synthesis-oriented
AI Analysis

The work addresses communication barriers for individuals with hearing and speech impairments, though it appears incremental because it applies existing methods to a specific dataset.

The paper tackles real-time translation of sign language gestures to speech using CNNs trained on Sign Language MNIST, achieving high accuracy and robust performance with some latency.

Communication barriers pose significant challenges for individuals with hearing and speech impairments, often limiting their ability to interact effectively in everyday environments. This project introduces a real-time assistive technology solution that leverages deep learning techniques to translate sign language gestures into text and audible speech. By employing convolutional neural networks (CNNs) trained on the Sign Language MNIST dataset, the system classifies hand gestures captured live via webcam. Detected gestures are translated into their corresponding meanings and rendered as spoken language using text-to-speech synthesis, thus facilitating communication. Comprehensive experiments demonstrate high model accuracy and robust real-time performance with some latency, highlighting the system's practical applicability as an accessible, reliable, and user-friendly tool for enhancing the autonomy and integration of sign language users in diverse social settings.
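To make the classify-then-speak step concrete, here is a minimal Python sketch of one way per-frame predictions could be turned into letters before being handed to a text-to-speech engine. It is not the authors' implementation. The abstract does not say how frames are aggregated, so the `GestureStabilizer` class, its `window` and `min_share` parameters, and the `label_to_letter` helper are all hypothetical names introduced for illustration. The one grounded detail is the dataset's label scheme: Sign Language MNIST uses labels 0 to 25 for A to Z and contains no samples for J (9) or Z (25), since those letters require motion.

```python
from collections import Counter, deque
import string

# Sign Language MNIST labels map 0-25 to A-Z; J (9) and Z (25) have no
# samples because those letters are signed with motion.
MOTION_LABELS = {9, 25}


def label_to_letter(label: int) -> str:
    """Map a dataset label to its letter (hypothetical helper)."""
    if not 0 <= label <= 25 or label in MOTION_LABELS:
        raise ValueError(f"label {label} has no static gesture")
    return string.ascii_uppercase[label]


class GestureStabilizer:
    """Assumed smoothing step: emit a letter only when one label dominates
    the last `window` frames, and never emit the same letter twice in a row."""

    def __init__(self, window: int = 15, min_share: float = 0.8):
        self.window = window
        self.min_share = min_share
        self.frames = deque(maxlen=window)
        self.last_emitted = None

    def update(self, label: int):
        """Feed one per-frame prediction; return a letter or None."""
        self.frames.append(label)
        if len(self.frames) < self.window:
            return None  # not enough frames observed yet
        top, count = Counter(self.frames).most_common(1)[0]
        if count / self.window < self.min_share:
            return None  # prediction is still unstable
        if top == self.last_emitted:
            return None  # gesture is being held; do not repeat it
        letter = label_to_letter(top)
        self.last_emitted = top
        return letter
```

In a webcam loop, each frame's predicted label would be passed to `update`, and any returned letter appended to the text that is spoken. A larger `window` suppresses flicker but delays output by that many frames, which is one plausible source of the kind of latency the abstract mentions. If the classifier has 24 outputs rather than 26, its indices would first need remapping to the 0 to 25 label scheme.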

