Open Universal Arabic ASR Leaderboard
This work addresses the Arabic ASR community's need for standardized benchmarking of model generalization across dialects. Its contribution is incremental: it establishes an evaluation framework rather than proposing new methods.
The paper addresses the evaluation of Arabic automatic speech recognition (ASR) models across multiple dialects. It introduces the Open Universal Arabic ASR Leaderboard, a continuous benchmark project, and provides a comprehensive analysis of model robustness, speaker adaptation, inference efficiency, and memory consumption.
In recent years, the enhanced capabilities of ASR models and the emergence of multi-dialect datasets have increasingly pushed Arabic ASR model development toward an all-dialect-in-one direction. This trend highlights the need for benchmarking studies that evaluate model performance on multiple dialects, providing the community with insights into models' generalization capabilities. In this paper, we introduce the Open Universal Arabic ASR Leaderboard, a continuous benchmark project for open-source general Arabic ASR models across various multi-dialect datasets. We also provide a comprehensive analysis of the models' robustness, speaker adaptation, inference efficiency, and memory consumption. This work aims to offer the Arabic ASR community a reference for models' general performance and to establish a common evaluation framework for multi-dialectal Arabic ASR models.
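To make the idea of a multi-dataset benchmark concrete, the following is a minimal sketch of how one leaderboard row could be computed. It assumes word error rate (WER) as the metric, which the text above does not name, and it assumes a simple unweighted average across datasets. The function names (`word_edit_distance`, `corpus_wer`, `leaderboard_row`) and the dataset keys are hypothetical, and the text normalization that Arabic transcripts would likely need before scoring is omitted.

```python
from typing import Dict, List, Tuple


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance between two word sequences."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_wer(pairs: List[Tuple[str, str]]) -> float:
    """WER over (reference, hypothesis) pairs: total word errors / total reference words."""
    errors = sum(word_edit_distance(ref.split(), hyp.split()) for ref, hyp in pairs)
    words = sum(len(ref.split()) for ref, _ in pairs)
    if words == 0:
        raise ValueError("references contain no words")
    return errors / words


def leaderboard_row(results: Dict[str, List[Tuple[str, str]]]) -> Dict[str, float]:
    """Per-dataset WER plus an unweighted average (an assumed aggregation)."""
    row = {name: corpus_wer(pairs) for name, pairs in results.items()}
    row["average"] = sum(row.values()) / len(row)
    return row


# Placeholder tokens stand in for real transcripts; dataset names are hypothetical.
row = leaderboard_row({
    "dataset_a": [("a b", "a b")],
    "dataset_b": [("a b c d", "a x c")],
})
```

An unweighted average treats each dataset equally regardless of size; weighting by reference word count would be an alternative design choice if larger test sets should dominate the ranking.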