CL · Feb 12, 2024

Unsupervised Sign Language Translation and Generation

Tencent
arXiv:2402.07726v1 · 28 citations · h-index: 27 · ACL
Originality: Incremental advance
AI Analysis

This paper addresses sign language translation for accessibility, but the advance is incremental: it adapts unsupervised neural machine translation methods to a cross-modality setting.

The paper tackles unsupervised sign language translation and generation by introducing USLNet, which learns from single-modality text and video data without parallel data, achieving competitive results compared to supervised baselines on datasets like BOBSL and OpenASL.

Motivated by the success of unsupervised neural machine translation (UNMT), we introduce an unsupervised sign language translation and generation network (USLNet), which learns from abundant single-modality (text and video) data without parallel sign language data. USLNet comprises two main components: single-modality reconstruction modules (text and video) that rebuild the input from its noisy version in the same modality, and cross-modality back-translation modules (text-video-text and video-text-video) that reconstruct the input from its noisy version in the other modality using a back-translation procedure. Unlike the single-modality back-translation procedure in text-based UNMT, USLNet faces a cross-modality discrepancy in feature representation: text and video sequences differ in both length and feature dimension. We propose a sliding window method to align variable-length text with video sequences. To our knowledge, USLNet is the first unsupervised sign language translation and generation model capable of generating both natural language text and sign language video in a unified manner. Experimental results on the BBC-Oxford Sign Language dataset (BOBSL) and the Open-Domain American Sign Language dataset (OpenASL) show that USLNet achieves competitive results compared to supervised baseline models, indicating its effectiveness in sign language translation and generation.
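
The length and dimension mismatch described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how the sliding window is computed, so the averaging scheme, the function names `sliding_window_align` and `project`, and the `window` parameter below are all assumptions made for illustration. The first function resamples a text feature sequence to a target number of video steps, and the second applies a linear map from the text feature dimension to the video feature dimension.

```python
from typing import List

Vector = List[float]


def sliding_window_align(text_feats: List[Vector], num_frames: int,
                         window: int = 3) -> List[Vector]:
    """Resample a text feature sequence to `num_frames` steps.

    Hypothetical helper (not from the paper): each output step is the mean
    of the text vectors inside a window centred on the proportionally
    matching text position.
    """
    if not text_feats or num_frames <= 0:
        raise ValueError("need a non-empty text sequence and num_frames > 0")
    n = len(text_feats)
    dim = len(text_feats[0])
    half = window // 2
    aligned = []
    for j in range(num_frames):
        # Map output step j to a text index by proportional position.
        centre = min(n - 1, (j * n) // num_frames)
        # Clip the window at the sequence boundaries.
        lo = max(0, centre - half)
        hi = min(n, centre + half + 1)
        span = text_feats[lo:hi]
        aligned.append([sum(v[k] for v in span) / len(span) for k in range(dim)])
    return aligned


def project(seq: List[Vector], weights: List[Vector]) -> List[Vector]:
    """Linear map to the video feature size (assumed, illustrative).

    `weights` has shape [d_video][d_text]; in a trained model these values
    would be learned rather than fixed by hand.
    """
    return [[sum(w * x for w, x in zip(row, vec)) for row in weights]
            for vec in seq]
```

Calling `sliding_window_align(text_feats, num_frames=len(video))` followed by `project(...)` yields a sequence with the same length and feature size as the video side, which is the precondition for comparing or reconstructing across modalities. A larger `window` gives smoother transitions between neighbouring tokens at the cost of blurring token boundaries.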
