CV · May 6, 2021

Pose-Guided Sign Language Video GAN with Dynamic Lambda

Christopher Kissel, Christopher Kümmel, Dennis Ritter, Kristian Hildebrand

arXiv:2105.02742v13.710 citations

Originality: Incremental advance

AI Analysis

This work addresses sign language video synthesis, which could aid communication for deaf and hard-of-hearing communities, but appears incremental as it builds directly on prior methods.

The paper tackles sign language video synthesis by extending previous GAN-based methods with pose guidance and a periodic weighting approach, achieving an SSIM of 0.893 on the MS-ASL dataset, which contains over 200 signers.

We propose a novel approach for the synthesis of sign language videos using GANs. We extend the previous work of Stoll et al. by using the human semantic parser of the Soft-Gated Warping-GAN to produce photorealistic videos guided by region-level spatial layouts. Synthesizing target poses improves performance on independent and contrasting signers. Therefore, we have evaluated our system on the highly heterogeneous MS-ASL dataset with over 200 signers, resulting in an SSIM of 0.893. Furthermore, we introduce a periodic weighting approach to the generator that reactivates the training and leads to quantitatively better results.
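For readers unfamiliar with the SSIM score quoted above, the following is a minimal sketch of the structural similarity formula, computed over a whole image as a single window. It is an illustration only and not the paper's evaluation code: standard SSIM implementations average the score over local sliding windows, and the function name `global_ssim`, the flat-list input format, and the default `data_range` are assumptions made here. The constants `k1 = 0.01` and `k2 = 0.03` are the values commonly used with SSIM.

```python
def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equal-length sequences of pixel intensities.

    Simplification (assumption): the whole image is treated as one window,
    whereas standard SSIM averages over local sliding windows.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(x)
    # Stabilizing constants that keep the ratio well-defined for flat images.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2

    # Means, (population) variances, and covariance of the two signals.
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 2x2 grayscale frames, flattened row by row.
generated = [0.0, 50.0, 100.0, 150.0]
reference = [150.0, 100.0, 50.0, 0.0]
score_same = global_ssim(generated, generated)
score_diff = global_ssim(generated, reference)
```

Identical frames score 1 (up to floating-point rounding), and the score drops as the generated frame diverges from the reference in luminance, contrast, or structure.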
