Phonology Recognition in American Sign Language
This work addresses sign language processing for ASL users by introducing a novel approach based on phonological properties, though it appears incremental as it builds on existing NLP and deep learning techniques.
The paper tackles the problem of recognizing phonological properties in American Sign Language (ASL) by using a pretrained deep model for 3D mesh reconstruction and training machine learning models to classify signs. It reports micro-averaged F1-scores of 58% for the major location class and 70% for the sign type, compared to baselines of 35% and 39%, respectively.
Inspired by recent developments in natural language processing, we propose a novel approach to sign language processing based on phonological properties validated by American Sign Language users. Drawing on datasets that contain phonological data and videos of people signing, we use a pretrained deep model based on mesh reconstruction to extract the 3D coordinates of the signers' keypoints. We then train standard statistical and deep machine learning models to assign phonological classes to each temporal sequence of coordinates. Our paper introduces the idea of using phonological properties manually assigned by sign language users to classify videos of people performing signs, with a regressed 3D mesh as the source of the input coordinates. We establish a new baseline for this problem based on the statistical distribution of 725 different signs. Our best-performing models achieve a micro-averaged F1-score of 58% for the major location class and 70% for the sign type, compared to their corresponding baselines of 35% and 39%.
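To make the evaluation concrete, the following is a minimal sketch of how a distribution-based baseline and a micro-averaged F1-score could be computed. It assumes one plausible reading of the baseline, namely always predicting the most frequent class in the training labels; the exact baseline construction is not specified here. The function names `micro_f1` and `majority_baseline` and the toy location labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label and truth != label:
                fp += 1
            elif pred != label and truth == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def majority_baseline(train_labels, n_test):
    """Assumed baseline: always predict the most frequent training class."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n_test


# Toy, hypothetical location labels (not taken from the actual dataset).
train = ["head", "head", "body", "hand", "head", "neutral"]
test_true = ["head", "body", "head", "neutral"]

test_pred = majority_baseline(train, len(test_true))
score = micro_f1(test_true, test_pred)
print(f"baseline micro-F1: {score:.2f}")
```

When every sign receives exactly one label per property, each misclassification counts as one false positive and one false negative, so the micro-averaged F1-score coincides with plain accuracy. This makes the reported scores and baselines directly comparable as fractions of correctly classified signs.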