An Open-Source American Sign Language Fingerspell Recognition and Semantic Pose Retrieval Interface
This work addresses accessibility for deaf and hard-of-hearing individuals by providing a modular tool for sign language translation, though the contribution appears incremental: the authors position it as a stepping stone towards more advanced systems.
The paper tackles American Sign Language (ASL) fingerspelling translation by developing an open-source interface that translates ASL fingerspelling into English and produces ASL pose sequences from English, with the aim of running in real time under diverse conditions.
This paper introduces an open-source interface for American Sign Language (ASL) fingerspell recognition and semantic pose retrieval, intended to serve as a stepping stone towards more advanced sign language translation systems. Using a combination of convolutional neural networks and pose estimation models, the interface provides two modular components: a recognition module that translates ASL fingerspelling into spoken English and a production module that converts spoken English into ASL pose sequences. The system is designed to be highly accessible, user-friendly, and capable of running in real time under varying conditions such as backgrounds, lighting, skin tones, and hand sizes. We discuss the technical details of the model architecture, its application in the wild, and potential future enhancements for real-world consumer applications.
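The two-module design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `collapse_frame_predictions`, `PoseRetriever`, `min_run`, and the `Pose` type are hypothetical, the neural models are omitted entirely (per-frame letter predictions are taken as given), and an exact-match dictionary lookup with a letter-by-letter fingerspelling fallback stands in for whatever semantic retrieval the interface actually performs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence

# Hypothetical representation: one pose is a flat list of keypoint coordinates.
Pose = List[float]


def collapse_frame_predictions(frame_letters: Sequence[str], min_run: int = 3) -> str:
    """Recognition side: turn per-frame letter predictions into a spelled string.

    A letter is emitted once for each run of at least `min_run` identical
    consecutive predictions; shorter runs are treated as classifier noise.
    Assumption: a repeated letter (e.g. "LL") is separated by other frames.
    """
    spelled: List[str] = []
    run_letter: Optional[str] = None
    run_length = 0
    for letter in frame_letters:
        if letter == run_letter:
            run_length += 1
        else:
            run_letter, run_length = letter, 1
        # Emit exactly once, at the moment the run reaches the threshold.
        if run_length == min_run:
            spelled.append(letter)
    return "".join(spelled)


@dataclass
class PoseRetriever:
    """Production side: map English text to a sequence of stored poses."""

    word_poses: Dict[str, List[Pose]] = field(default_factory=dict)
    letter_poses: Dict[str, Pose] = field(default_factory=dict)

    def retrieve(self, text: str) -> List[Pose]:
        sequence: List[Pose] = []
        for word in text.lower().split():
            if word in self.word_poses:
                # A stored pose sequence exists for the whole word.
                sequence.extend(self.word_poses[word])
            else:
                # Assumed fallback: fingerspell the word letter by letter,
                # skipping characters that have no stored pose.
                sequence.extend(
                    self.letter_poses[c] for c in word if c in self.letter_poses
                )
        return sequence


if __name__ == "__main__":
    frames = ["H", "H", "H", "H", "X", "I", "I", "I"]
    print(collapse_frame_predictions(frames))

    retriever = PoseRetriever(
        word_poses={"hello": [[0.0], [1.0]]},
        letter_poses={"a": [2.0], "b": [3.0]},
    )
    print(retriever.retrieve("Hello ab"))
```

Keeping the two directions behind separate, small interfaces mirrors the modularity the abstract emphasizes: either side could be swapped for a stronger model without touching the other.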