signwriting-evaluation: Effective Sign Language Evaluation via SignWriting
This work addresses the evaluation of transcription and translation models for signed languages and offers essential tools for researchers in sign language processing. The contribution is incremental, since it adapts existing metrics to a new domain.
The paper tackles the lack of automatic evaluation metrics for SignWriting in sign language processing. It introduces a suite of metrics (adaptations of BLEU and chrF, CLIPScore applied to SignWriting images, and a novel symbol distance metric) and provides qualitative analyses that reveal the strengths and limitations of each.
The lack of automatic evaluation metrics tailored for SignWriting is a significant obstacle to developing effective transcription and translation models for signed languages. This paper introduces a suite of evaluation metrics specifically designed for SignWriting, including adaptations of standard metrics such as \texttt{BLEU} and \texttt{chrF}, the application of \texttt{CLIPScore} to SignWriting images, and a novel symbol distance metric. We address the distinct challenges of evaluating single signs versus continuous signing and provide qualitative demonstrations of metric efficacy through score distribution analyses and nearest-neighbor searches within the SignBank corpus. Our findings reveal the strengths and limitations of each metric, offering insights for future work using SignWriting. This work contributes essential tools for evaluating SignWriting models, facilitating progress in the field of sign language processing. Our code is available at \url{https://github.com/sign-language-processing/signwriting-evaluation}.
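To make the idea of adapting a character-level metric to SignWriting concrete, the following is a minimal sketch of a chrF-style score computed directly over Formal SignWriting (FSW) strings. It is an illustration under stated assumptions, not the implementation in the repository above: the function names `char_ngrams` and `chrf_sketch` are hypothetical, the FSW strings are illustrative examples, and any SignWriting-specific tokenization the paper may apply is not reproduced here.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count all character n-grams of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def chrf_sketch(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF: average n-gram precision and recall over orders
    1..max_n, then combine them into an F-beta score (beta=2 favors recall).
    Hypothetical helper for illustration only."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # One string is shorter than n characters: skip this order.
            continue
        # Clipped overlap: each n-gram counts at most as often as in both.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


# Illustrative FSW strings (assumed examples, not taken from the paper):
# the hypothesis differs from the reference only in one symbol position.
reference = "M518x529S14c20481x471S27106503x489"
hypothesis = "M518x529S14c20481x471S27106503x495"

identical_score = chrf_sketch(reference, reference)
partial_score = chrf_sketch(hypothesis, reference)
disjoint_score = chrf_sketch("abc", "xyz")
```

A raw character-level score like this treats symbol identifiers and coordinates uniformly, so a small positional shift and a different handshape can be penalized similarly. That limitation is one motivation for metrics that are aware of SignWriting structure, such as the symbol distance metric described above.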