CV | Mar 15, 2024

Testing MediaPipe Holistic for Linguistic Analysis of Nonmanual Markers in Sign Languages

arXiv:2403.10367v2 | 4 citations | h-index: 2
AI Analysis

This work addresses a practical problem for sign language researchers: the accurate analysis of nonmanual markers. Its contribution is incremental, since it builds on prior proposals without new breakthroughs.

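The researchers tested MediaPipe Holistic for tracking facial features in sign language videos to assess whether it is reliable enough for linguistic analysis. They found that it tracks eyebrow movements too poorly for that purpose. The older OpenFace also performs poorly without correction, though the two tools fail in different ways. The researchers reiterate an earlier proposal to train additional correction models.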

Advances in Deep Learning have made possible reliable landmark tracking of human bodies and faces that can be used for a variety of tasks. We test a recent Computer Vision solution, MediaPipe Holistic (MPH), to find out if its tracking of the facial features is reliable enough for a linguistic analysis of data from sign languages, and compare it to an older solution (OpenFace, OF). We use an existing data set of sentences in Kazakh-Russian Sign Language and a newly created small data set of videos with head tilts and eyebrow movements. We find that MPH does not perform well enough for linguistic analysis of eyebrow movement, but in a different way from OF, which also performs poorly without correction. We reiterate a previous proposal to train additional correction models to overcome these limitations.

