CL · Mar 18, 2014

Sign Language Gibberish for syntactic parsing evaluation

arXiv:1403.4473v1 · 2 citations

Originality: Synthesis-oriented

AI Analysis

The paper addresses a data shortage facing researchers in sign language processing, though it is an incremental contribution focused on evaluation rather than on parsing itself.

The paper tackles the lack of data for evaluating syntactic parsers for Sign Language (SL) by proposing a method to generate synthetic datasets, which helps assess how parsing techniques scale to large models.

Automatic processing of Sign Language (SL) is progressing slowly, from the bottom up. The field has seen proposals for handling the video signal and for recognizing and synthesizing sublexical and lexical units. Supra-lexical processing is now beginning to develop, but recognition at this level lacks data. The syntax of SL appears highly specific, as it makes massive use of multiple articulators and of the spatial dimensions. New parsing techniques are therefore being developed, and these need to be evaluated. However, the shortage of real data restricts corpus-based models to small sizes. We propose a solution for producing datasets to evaluate parsers on the specific properties of SL. The article first describes the general model used to generate dependency grammars and the generation of phrases from these grammars. It then discusses the limits of the approach. The solution proves to be of particular interest for evaluating the scalability of the techniques on big models.
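
As a rough illustration of the two steps the abstract names (generating a dependency grammar, then generating phrases from it), the sketch below builds a random toy grammar and expands it into a dependency tree. It is not the paper's model: the function names, the integer categories, the 0.5 inclusion probability, and the rule that a head may only govern higher-numbered categories are all assumptions made for this example, and nothing here captures SL-specific properties such as multiple articulators or spatial relations.

```python
import random


def make_grammar(n_categories, max_dependents, rng):
    """Build a toy dependency grammar (hypothetical format).

    Each head category maps to the list of categories it may govern.
    Only higher-numbered categories are allowed as dependents, so
    phrase generation is guaranteed to terminate.
    """
    grammar = {}
    for head in range(n_categories):
        candidates = list(range(head + 1, n_categories))
        k = min(max_dependents, len(candidates))
        grammar[head] = rng.sample(candidates, rng.randint(0, k))
    return grammar


def generate_phrase(grammar, head, rng):
    """Expand a head into a dependency tree: (head, [subtrees]).

    Each licensed dependent is included with probability 0.5
    (an arbitrary choice for this sketch).
    """
    dependents = [
        generate_phrase(grammar, dep, rng)
        for dep in grammar[head]
        if rng.random() < 0.5
    ]
    return (head, dependents)


def tree_size(tree):
    """Count the nodes of a generated dependency tree."""
    return 1 + sum(tree_size(child) for child in tree[1])


rng = random.Random(0)  # fixed seed so runs are repeatable
grammar = make_grammar(n_categories=20, max_dependents=3, rng=rng)
phrase = generate_phrase(grammar, head=0, rng=rng)
print(tree_size(phrase))
```

Because the grammar size is a parameter, raising `n_categories` gives arbitrarily large grammars to run a parser against, which is the kind of scalability test the abstract has in mind.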
