LG Mar 4, 2025

A Transformer-Based Framework for Greek Sign Language Production using Extended Skeletal Motion Representations

Chrysa Pratikaki, Panagiotis Filntisis, Athanasios Katsamanis, Anastasios Roussos, Petros Maragos

arXiv:2503.02421v1 4.1 h-index: 24

Originality: Incremental advance

AI Analysis

This work addresses communication barriers for the Deaf and Hard-of-Hearing community in Greece by providing a first attempt at Greek Sign Language Production, though it is incremental as it builds on existing methods.

The authors tackle the problem of translating spoken language into Greek Sign Language by proposing a transformer-based model that converts text to human pose keypoints and vice versa. They report improved video quality from components such as data-driven gloss generation and a scheduling algorithm.

Sign languages are the primary form of communication for Deaf communities across the world. To break the communication barriers between the Deaf and Hard-of-Hearing and the hearing communities, it is imperative to build systems capable of translating spoken language into sign language and vice versa. Building on insights from previous research, we propose a deep learning model for Sign Language Production (SLP), which to our knowledge is the first attempt at Greek SLP. We tackle this task with a transformer-based architecture that enables translation from text input to human pose keypoints, and vice versa. We evaluate the effectiveness of the proposed pipeline on the Greek SL dataset Elementary23 through a series of comparative analyses and ablation studies. Our pipeline's components, which include data-driven gloss generation, training through video-to-text translation, and a scheduling algorithm for teacher forcing and auto-regressive decoding, appear to improve the quality of the produced SL videos.
