AS · AI · CL · SD · May 27, 2025

PSRB: A Comprehensive Benchmark for Evaluating Persian ASR Systems

arXiv:2505.21230v1 · 2 citations · h-index: 1 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work provides a benchmark for advancing ASR research in Persian and offers a framework for other low-resource languages, though it is incremental: it adapts existing evaluation concepts to a specific domain.

The paper tackles the challenge of evaluating ASR systems for low-resource languages like Persian by introducing the Persian Speech Recognition Benchmark (PSRB), which includes diverse linguistic and acoustic conditions. Results show that while ASR models perform well on standard Persian, they struggle with regional accents, children's speech, and specific linguistic challenges.

Although Automatic Speech Recognition (ASR) systems have become an integral part of modern technology, their evaluation remains challenging, particularly for low-resource languages such as Persian. This paper introduces the Persian Speech Recognition Benchmark (PSRB), a comprehensive benchmark designed to address this gap by incorporating diverse linguistic and acoustic conditions. We evaluate ten ASR systems, including state-of-the-art commercial and open-source models, to examine performance variations and inherent biases. Additionally, we conduct an in-depth analysis of Persian ASR transcriptions, identifying key error types and proposing a novel metric that weights substitution errors. This metric enhances evaluation robustness by reducing the impact of minor and partial errors, thereby improving the precision of performance assessment. Our findings indicate that while ASR models generally perform well on standard Persian, they struggle with regional accents, children's speech, and specific linguistic challenges. These results highlight the necessity of fine-tuning and incorporating diverse, representative training datasets to mitigate biases and enhance overall ASR performance. PSRB provides a valuable resource for advancing ASR research in Persian and serves as a framework for developing benchmarks in other low-resource languages. A subset of the PSRB dataset is publicly available at https://huggingface.co/datasets/PartAI/PSRB.
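The abstract does not define the substitution-weighted metric, so the following is only an illustrative sketch of the general idea, not the authors' formula. It assumes a substitution is charged the normalized character edit distance between the reference and hypothesis words (a value in [0, 1]), while insertions and deletions keep a cost of 1. The function names `char_distance`, `substitution_weight`, and `error_rate` are hypothetical.

```python
def char_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a character of a
                curr[j - 1] + 1,           # insert a character of b
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def substitution_weight(ref_word: str, hyp_word: str) -> float:
    """Assumed weighting: character edit distance scaled to [0, 1]."""
    if ref_word == hyp_word:
        return 0.0
    return char_distance(ref_word, hyp_word) / max(len(ref_word), len(hyp_word))


def error_rate(reference: str, hypothesis: str, weighted: bool = False) -> float:
    """Word error rate; with weighted=True, substitutions cost less than 1
    when the two words are only partially different."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Word-level edit distance, keeping only the previous row of the table.
    prev = [float(j) for j in range(len(hyp) + 1)]
    for i, rw in enumerate(ref, start=1):
        curr = [float(i)]
        for j, hw in enumerate(hyp, start=1):
            if weighted:
                sub = substitution_weight(rw, hw)
            else:
                sub = 0.0 if rw == hw else 1.0
            curr.append(min(
                prev[j] + 1.0,      # deletion of a reference word
                curr[j - 1] + 1.0,  # insertion of a hypothesis word
                prev[j - 1] + sub,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


plain = error_rate("the cat sat", "the cat sit")
soft = error_rate("the cat sat", "the cat sit", weighted=True)
print(f"WER = {plain:.3f}, weighted = {soft:.3f}")
```

Under this assumed weighting, a near-miss such as "sat" versus "sit" contributes a third of a full error rather than a whole one, which is the kind of reduced penalty for minor and partial errors that the abstract describes. The English example is used only for readability; the same sketch applies to whitespace-tokenized Persian text.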
