CV CL IV
Jul 28, 2025

The Importance of Facial Features in Vision-based Sign Language Recognition: Eyes, Mouth or Full Face?

arXiv:2507.20884v26.24 citations h-index: 22 IVA

Originality: Incremental advance

AI Analysis

This work addresses the need for better automatic sign language recognition systems. It is an incremental advance that builds on prior research by systematically comparing facial regions.

The study investigates which facial regions matter most for automatic sign language recognition and finds that the mouth is the most important non-manual feature, significantly improving accuracy.

Non-manual facial features play a crucial role in sign language communication, yet their importance in automatic sign language recognition (ASLR) remains underexplored. While prior studies have shown that incorporating facial features can improve recognition, related work often relies on hand-crafted feature extraction and fails to go beyond the comparison of manual features versus the combination of manual and facial features. In this work, we systematically investigate the contribution of distinct facial regions (eyes, mouth, and full face) using two different deep learning models (a CNN-based model and a transformer-based model) trained on an SLR dataset of isolated signs with randomly selected classes. Through quantitative performance and qualitative saliency map evaluation, we reveal that the mouth is the most important non-manual facial feature, significantly improving accuracy. Our findings highlight the necessity of incorporating facial features in ASLR.
