Prompting with Sign Parameters for Low-resource Sign Language Instruction Generation
This work addresses the need for accessible sign language learning tools in under-resourced communities, such as users of Bengali Sign Language, though it is an incremental improvement on existing prompting methods.
The authors address sign language instruction generation for under-resourced languages by creating the first Bengali dataset (BdSLIG) and introducing Sign Parameter-Infused (SPI) prompting. SPI prompting improved zero-shot performance by incorporating structured sign parameters, such as hand shape and motion, into the prompts.
Sign Language (SL) enables two-way communication for the deaf and hard-of-hearing community, yet many sign languages remain under-resourced in the AI space. Sign Language Instruction Generation (SLIG) produces step-by-step textual instructions that enable non-SL users to imitate and learn SL gestures, promoting two-way interaction. We introduce BdSLIG, the first Bengali SLIG dataset, which we use to evaluate Vision Language Models (VLMs) (i) on under-resourced SLIG tasks, and (ii) on long-tail visual concepts, as Bengali SL is unlikely to appear in VLM pre-training data. To enhance zero-shot performance, we introduce Sign Parameter-Infused (SPI) prompting, which integrates standard SL parameters, such as hand shape, motion, and orientation, directly into the textual prompts. Subsuming standard sign parameters into the prompt makes the instructions more structured and reproducible than free-form natural text from vanilla prompting. We envision that our work will promote inclusivity and advancement in SL learning systems for under-resourced communities.
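To make the contrast between vanilla and SPI prompting concrete, the following is a minimal Python sketch of how the two kinds of prompt might be assembled. It is an illustration under stated assumptions, not the authors' implementation: the function names `build_vanilla_prompt` and `build_spi_prompt`, the prompt wording, and the field layout are all hypothetical, and only the three parameters named in the abstract (hand shape, motion, orientation) are used. The paper's actual parameter set and templates may differ.

```python
from typing import Sequence

# Only the parameters named in the abstract; the paper's full list may differ.
SIGN_PARAMETERS = ("hand shape", "motion", "orientation")


def build_vanilla_prompt(word: str) -> str:
    """Free-form baseline prompt (hypothetical wording)."""
    return (
        "Describe step by step how to perform the Bengali Sign Language "
        f"sign for '{word}' shown in the image."
    )


def build_spi_prompt(word: str, parameters: Sequence[str] = SIGN_PARAMETERS) -> str:
    """Prompt requesting one labeled field per sign parameter (hypothetical wording)."""
    # One bullet per parameter, so every answer follows the same structure.
    fields = "\n".join(f"- {p.title()}: <description>" for p in parameters)
    return (
        "Describe how to perform the Bengali Sign Language "
        f"sign for '{word}' shown in the image. "
        f"Answer using exactly these fields:\n{fields}"
    )


if __name__ == "__main__":
    print(build_vanilla_prompt("water"))
    print()
    print(build_spi_prompt("water"))
```

In this sketch the SPI variant constrains the model to fill the same labeled fields for every sign, which is one plausible way to obtain the more structured and reproducible instructions the abstract describes. Either string would be passed to a VLM together with the sign image or video frames.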