CV CL Feb 18

Gloss-Free Sign Language Translation: An Unbiased Evaluation of Progress in the Field

Ozge Mercanoglu Sincan, Jian He Low, Sobhan Asasi, Richard Bowden

arXiv:2603.13240, 3 citations, h-index: 4, Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the unclear state of progress in sign language translation research by providing an unbiased evaluation, which shows that many reported contributions are incremental.

The paper re-implements recent gloss-free sign language translation models in a unified codebase to identify the sources of performance improvements. It finds that many reported gains diminish under consistent evaluation conditions, which highlights the impact of implementation details and evaluation setups.

Sign Language Translation (SLT) aims to automatically convert visual sign language videos into spoken language text and vice versa. While recent years have seen rapid progress, the true sources of performance improvements often remain unclear. Do reported performance gains come from methodological novelty, or from the choice of a different backbone, training optimizations, hyperparameter tuning, or even differences in the calculation of evaluation metrics? This paper presents a comprehensive study of recent gloss-free SLT models by re-implementing key contributions in a unified codebase. We ensure fair comparison by standardizing preprocessing, video encoders, and training setups across all methods. Our analysis shows that many of the performance gains reported in the literature often diminish when models are evaluated under consistent conditions, suggesting that implementation details and evaluation setups play a significant role in determining results. We make the codebase publicly available here (https://github.com/ozgemercanoglu/sltbaselines) to support transparency and reproducibility in SLT research.
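The abstract's point about "differences in the calculation of evaluation metrics" can be illustrated with a small sketch. The code below is not taken from the paper or its codebase, and it is not a full BLEU implementation. It computes only the clipped n-gram precision that BLEU builds on, and all function names and the example sentences are hypothetical. It shows how the same hypothesis and reference pair can score differently depending on whether text is lowercased and stripped of punctuation before matching.

```python
from collections import Counter
import string


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(hyp_tokens, ref_tokens, n):
    """Clipped (modified) n-gram precision against a single reference."""
    hyp = ngrams(hyp_tokens, n)
    ref = ngrams(ref_tokens, n)
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return matched / total


def tokenize_raw(text):
    """Assumed setup A: whitespace split, case and punctuation kept."""
    return text.split()


def tokenize_normalized(text):
    """Assumed setup B: lowercase and strip ASCII punctuation before splitting."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


# Hypothetical example pair, differing only in casing and a final period.
hypothesis = "the weather will be sunny tomorrow"
reference = "The weather will be sunny tomorrow."

for name, tokenize in [("raw", tokenize_raw), ("normalized", tokenize_normalized)]:
    hyp_tokens, ref_tokens = tokenize(hypothesis), tokenize(reference)
    p1 = clipped_precision(hyp_tokens, ref_tokens, 1)
    p2 = clipped_precision(hyp_tokens, ref_tokens, 2)
    print(f"{name}: unigram precision = {p1:.3f}, bigram precision = {p2:.3f}")
```

Under the raw setup the casing of "The" and the trailing period cost two of six unigram matches, while the normalized setup scores the pair as identical. Scores are therefore only comparable across papers when the preprocessing applied before scoring is the same.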
