CV
Sep 25, 2024

Pose-Guided Fine-Grained Sign Language Video Generation

arXiv:2409.16709v1
8 citations
h-index: 11
Originality: Incremental advance
AI Analysis

This paper addresses the need for high-quality sign language video generation, which supports spreading and learning sign language. It represents an incremental advance in domain-specific human image synthesis.

The paper tackles distorted details and poor temporal consistency in generated sign language videos by proposing a Pose-Guided Motion Model (PGMM), which outperforms state-of-the-art methods in most benchmark tests, with visible improvements in details and consistency.

Sign language videos are an important medium for spreading and learning sign language. However, most existing human image synthesis methods produce sign language images with details that are distorted, blurred, or structurally incorrect. They also produce sign language video frames with poor temporal consistency, with anomalies such as flickering and abrupt detail changes between the previous and next frames. To address these limitations, we propose a novel Pose-Guided Motion Model (PGMM) for generating fine-grained and motion-consistent sign language videos. First, we propose a new Coarse Motion Module (CMM), which deforms features by optical flow warping, thus transferring the motion of coarse-grained structures without changing their appearance. Second, we propose a new Pose Fusion Module (PFM), which guides the modal fusion of RGB and pose features, thus completing the fine-grained generation. Finally, we design a new metric, Temporal Consistency Difference (TCD), to quantitatively assess the temporal consistency of a video by comparing the difference between the frames of the reconstructed video and the previous and next frames of the target video. Extensive qualitative and quantitative experiments show that our method outperforms state-of-the-art methods in most benchmark tests, with visible improvements in details and temporal consistency.
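The abstract describes TCD only in words and does not give its formula. As a rough illustration of the idea, the sketch below scores how far each reconstructed frame's distance to the neighbouring target frames departs from the target's own frame-to-frame change. Everything here is an assumption made for illustration rather than the paper's definition: the name `tcd_sketch`, the helper `mean_abs_diff`, the use of mean absolute pixel error, grayscale frames stored as nested lists, and the simple averaging over frames.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count


def tcd_sketch(recon, target):
    """Hypothetical temporal-consistency score (NOT the paper's TCD formula).

    For every interior frame index t, compare the reconstructed frame with the
    previous and next target frames, and measure how much that distance deviates
    from the target's own distance to the same neighbours. Lower is better; a
    perfect reconstruction scores 0.
    """
    if len(recon) != len(target):
        raise ValueError("videos must have the same number of frames")
    if len(target) < 3:
        raise ValueError("need at least three frames to have an interior frame")

    deviations = []
    for t in range(1, len(target) - 1):
        for neighbour in (t - 1, t + 1):
            recon_change = mean_abs_diff(recon[t], target[neighbour])
            target_change = mean_abs_diff(target[t], target[neighbour])
            deviations.append(abs(recon_change - target_change))
    return sum(deviations) / len(deviations)
```

For example, with a three-frame target whose 1x2 frames brighten steadily (`[[0, 0]]`, `[[10, 10]]`, `[[20, 20]]`), passing the target as its own reconstruction gives a score of 0, while replacing the middle reconstructed frame with `[[30, 30]]` (a flicker) raises the score. The choice to normalise against the target's own motion is a design assumption, so that natural motion in the target is not penalised.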

