CV · Jul 27, 2025

Indian Sign Language Detection for Real-Time Translation using Machine Learning

Rajat Singhal, Jatin Gupta, Akhil Sharma, Anushka Gupta, Navya Sharma

arXiv:2507.20414v23.69 citations · h-index: 1 · RAIT

Originality: Synthesis-oriented

AI Analysis

This work addresses communication barriers for deaf and hard-of-hearing communities in India, where technological solutions are underdeveloped. It is incremental, however, because it applies existing methods to a new dataset.

The research tackles real-time translation of Indian Sign Language (ISL) by developing a CNN-based detection system, which achieves a reported classification accuracy of 99.95%.

Sign languages, which rely on visual-spatial patterns of hand gestures and body movements, are the primary mode of communication for deaf and mute communities worldwide. Effective communication is fundamental to human interaction, yet individuals in these communities often face significant barriers due to a scarcity of skilled interpreters and accessible translation technologies. This research addresses these challenges in the Indian context by focusing on Indian Sign Language (ISL). By leveraging machine learning, this study aims to bridge the communication gap for the deaf and hard-of-hearing population in India, where technological solutions for ISL are less developed than those for other sign languages. We propose a real-time ISL detection and translation system built upon a Convolutional Neural Network (CNN). Our model is trained on an ISL dataset and achieves a classification accuracy of 99.95%. This high accuracy indicates that the model can distinguish the nuanced visual features of different signs. The system is evaluated using accuracy, F1 score, precision, and recall to assess its reliability for real-world applications. For real-time implementation, the framework integrates MediaPipe for hand tracking and motion detection, enabling translation of dynamic gestures. This paper provides a detailed account of the model's architecture, the data preprocessing pipeline, and the classification methodology.
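The abstract names accuracy, precision, recall, and F1 score as the evaluation metrics but does not say how they are aggregated across sign classes. As a minimal sketch, assuming macro averaging (an assumption, not something the abstract states), the four metrics could be computed from true and predicted labels as follows. The function name `classification_metrics` and the sample labels are hypothetical and used only for illustration.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption here; the abstract does not
    specify the averaging scheme used by the authors.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count per-class true positives, false positives, and false negatives.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    def safe_div(num, den):
        # Define the ratio as 0.0 when a class has no relevant samples.
        return num / den if den else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical labels for two sign classes, "A" and "B".
    truth = ["A", "A", "B", "B"]
    predicted = ["A", "B", "B", "B"]
    print(classification_metrics(truth, predicted))
```

Reporting per-class precision and recall alongside accuracy matters for a multi-class sign classifier, because a single accuracy figure can hide classes that are frequently confused with one another.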
