Assessing Latency in ASR Systems: A Methodological Perspective for Real-Time Use
This work addresses latency issues that interpreters face in sensitive settings such as diplomatic meetings, but it appears incremental because it focuses on measurement methodology.
The paper addresses the mismatch between ASR system latency and the needs of real-time interpretation: it proposes a new method for measuring delay and validates usability in live scenarios, but it does not provide concrete numerical results.
Automatic speech recognition (ASR) systems generate real-time transcriptions but often miss nuances that human interpreters capture. While ASR is useful in many contexts, interpreters, who already use ASR tools such as Dragon, add critical value, especially in sensitive settings such as diplomatic meetings where subtle language is key. Human interpreters not only perceive these nuances but can also adjust in real time, improving accuracy, while ASR handles basic transcription tasks. However, ASR systems introduce a delay that does not align with real-time interpretation needs. The user-perceived latency of ASR systems differs from that of interpretation because the former measures the time between speech and transcription delivery. To address this, we propose a new approach to measuring delay in ASR systems and validate whether these systems are usable in live interpretation scenarios.
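As an illustration only, the definition of user-perceived latency given above (the time between speech and transcription delivery) can be sketched in a few lines of Python. This is not the measurement method proposed in the paper, which the abstract does not specify. The `Utterance` type, its field names, the choice of the end of the spoken segment as the reference point, and the sample timestamps are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Utterance:
    """One spoken segment with hypothetical timestamps, in seconds."""
    spoken_end_s: float  # assumed reference point: when the speaker finished
    delivered_s: float   # when the transcript reached the user


def perceived_latency(u: Utterance) -> float:
    """Time between speech and transcription delivery for one segment."""
    return u.delivered_s - u.spoken_end_s


def summarize(utterances):
    """Mean and worst-case latency over a list of segments."""
    latencies = [perceived_latency(u) for u in utterances]
    return {"mean_s": mean(latencies), "max_s": max(latencies)}


# Made-up timestamps for illustration, not measured data.
samples = [Utterance(1.0, 1.5), Utterance(4.0, 5.0), Utterance(9.0, 9.75)]
summary = summarize(samples)
print(summary)
```

Reporting the worst case alongside the mean is a design choice of this sketch: in a live setting, a single long delay may matter more to an interpreter than the average.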