SeMaScore: a new evaluation metric for automatic speech recognition tasks
SeMaScore provides a more efficient and reliable evaluation tool for speech recognition tasks, especially in real-world scenarios with atypical speech patterns. The contribution is incremental, since it builds on existing metrics such as BERTScore.
The authors address the problem of evaluating automatic speech recognition systems by introducing SeMaScore, a new metric that combines an error rate with a similarity score. It computes 41x faster than BERTScore and aligns well with human assessments and other metrics.
In this study, we present SeMaScore, generated using a segment-wise mapping and scoring algorithm that serves as an evaluation metric for automatic speech recognition tasks. SeMaScore leverages both the error rate and a more robust similarity score. We show that our algorithm's score generation improves upon the state-of-the-art BERTScore. Our experimental results show that SeMaScore corresponds well with expert human assessments, signal-to-noise ratio levels, and other natural language metrics. We outperform BERTScore by 41x in metric computation speed. Overall, we demonstrate that SeMaScore serves as a more dependable evaluation metric, particularly in real-world situations involving atypical speech patterns.
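The abstract does not state the formula, so the following is only an illustrative sketch of how an error rate and a similarity score could be folded into one number. It is not the authors' segment-wise mapping and scoring algorithm. The function names, the multiplicative penalty, and the use of a character-level `difflib` ratio in place of an embedding-based similarity are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Standard WER: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def toy_similarity(a, b):
    """Stand-in similarity in [0, 1]. Assumption: a real metric would
    compare contextual embeddings rather than raw characters."""
    return SequenceMatcher(None, a, b).ratio()


def combined_score(reference, hypothesis):
    """Hypothetical combination: similarity scaled down by the error rate.
    The error rate is capped at 1 so the score stays in [0, 1]."""
    penalty = 1.0 - min(word_error_rate(reference, hypothesis), 1.0)
    return toy_similarity(reference, hypothesis) * penalty


if __name__ == "__main__":
    ref = "the cat sat"
    for hyp in ("the cat sat", "the bat sat", ""):
        print(repr(hyp), round(combined_score(ref, hyp), 3))
```

Under these assumptions, a hypothesis that is textually close but contains word errors is penalized more than the similarity score alone would suggest, which is the general motivation for combining the two signals.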