Tamaththul3D: High-Fidelity 3D Saudi Sign Language Avatars from Monocular Video
This work addresses the lack of 3D annotations and reconstruction methods for Arabic Sign Language, enabling accessibility technologies and cultural preservation for the Arab Deaf community.
The paper introduces the first high-quality 3D parametric annotations for the Ishara-500 Saudi Sign Language dataset and presents Tamaththul3D, a reconstruction pipeline that achieves up to 32% improvement in hand accuracy over previous methods for Arabic Sign Language avatar generation.
Arabic Sign Language (ArSL) and its dialects serve approximately 400 million Arabic speakers worldwide, yet the community lacks high-quality 3D parametric annotations and specialized reconstruction methods for avatar generation. We address this gap with two contributions. First, we introduce the first high-quality 3D parametric annotations for the Ishara-500 Saudi Sign Language (SSL) dataset, providing precise SMPL-X parameters for 500 culturally authentic SSL signs. Second, we present Tamaththul3D, a specialized reconstruction pipeline designed for ArSL's unique articulation patterns. The pipeline integrates SMPLer-X for robust body estimation, WiLoR for detailed hand refinement with automatic localization and mirroring, and MediaPipe for 2D pose supervision. Through kinematic-chain-based wrist alignment with hybrid swing-twist decomposition and 2D-supervised joint optimization, Tamaththul3D achieves state-of-the-art hand accuracy (up to 32% improvement over previous methods) while maintaining competitive body pose accuracy. Together, the 3D annotations and the Tamaththul3D pipeline establish the first comprehensive framework for high-fidelity ArSL avatar reconstruction, enabling new accessibility technologies and cultural preservation efforts for the Arab Deaf community.
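The abstract names swing-twist decomposition as part of the wrist alignment step but does not spell out the "hybrid" variant. As a point of reference only, the sketch below shows the textbook swing-twist decomposition of a unit quaternion about a chosen axis. It is not the authors' implementation: the `[w, x, y, z]` quaternion layout, the function names, and the choice of the x-axis as a stand-in for the forearm direction are all assumptions made for illustration.

```python
import numpy as np

def quat_mul(q1, q2):
    """Hamilton product of two quaternions in [w, x, y, z] order (assumed layout)."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def quat_conj(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def swing_twist(q, axis, eps=1e-9):
    """Split q into (swing, twist) so that q = swing * twist.

    twist rotates about `axis`; swing rotates about an axis perpendicular to it.
    """
    q = np.asarray(q, dtype=float)
    q = q / np.linalg.norm(q)
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)

    # Project the vector part of q onto the twist axis.
    proj = np.dot(q[1:], axis) * axis
    twist = np.array([q[0], proj[0], proj[1], proj[2]])
    norm = np.linalg.norm(twist)
    if norm < eps:
        # Singular case: a 180-degree rotation about an axis perpendicular
        # to `axis`. The twist is undefined, so fall back to identity.
        twist = np.array([1.0, 0.0, 0.0, 0.0])
    else:
        twist = twist / norm

    swing = quat_mul(q, quat_conj(twist))
    return swing, twist

# Hypothetical example: a wrist rotation built from a 90-degree turn about z
# followed by a 60-degree roll about x, decomposed about the x-axis.
rot_z = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
rot_x = np.array([np.cos(np.pi / 6), np.sin(np.pi / 6), 0.0, 0.0])
wrist = quat_mul(rot_z, rot_x)
swing, twist = swing_twist(wrist, axis=[1.0, 0.0, 0.0])
```

Separating the two components is what makes this decomposition attractive for wrist alignment in general: the roll about the forearm (twist) and the bending of the hand (swing) can be constrained or blended independently. How Tamaththul3D combines them is not stated in the abstract.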