CL · Sep 20, 2023

SignBank+: Preparing a Multilingual Sign Language Dataset for Machine Translation Using Large Language Models

arXiv:2309.11566v2 · 1 citation · h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses sign language translation for accessibility and research, but it is incremental as it builds on existing datasets and methods.

The authors address machine translation between spoken language text and SignWriting by introducing SignBank+, a cleaned version of the SignBank dataset. They show that a traditional text-to-text translation approach performs as effectively as more complex methods on the cleaned data, and that models trained on SignBank+ surpass those trained on the original dataset, establishing a new benchmark.

We introduce SignBank+, a clean version of the SignBank dataset, optimized for machine translation between spoken language text and SignWriting, a phonetic sign language writing system. In contrast to previous work that employs complex factorization techniques to enable translation between text and SignWriting, we show that a traditional text-to-text translation approach performs equally effectively on the cleaned SignBank+ dataset. Our evaluation results indicate that models trained on SignBank+ surpass those on the original dataset, establishing a new benchmark for SignWriting-based sign language translation and providing an open resource for future research.

Code Implementations: 2 repos
