CL, HC | Jun 4

Ouvia: A User-centered Framework for Measuring Usability of Speech Translation in Real-World Communication Scenarios

arXiv:2606.06177
AI Analysis

This work addresses the need to evaluate speech translation systems from the perspective of end users in real-world communication, highlighting gaps in current holistic quality metrics.

The paper introduces Ouvia, a user-centered framework for evaluating speech translation usability in real-world communication scenarios. Results show that only about half of interactions are rated as usable, with significant demographic gaps, and that QA-based evaluation is a stronger predictor of usability than standard metrics.

Speech translation (ST) is increasingly adopted in user applications, yet its evaluation largely focuses on decontextualized testbeds and holistic quality, rather than end users' communication needs. We introduce Ouvia, an evaluation framework for measuring user-perceived usability of speech translation outputs in real-world settings. Ouvia focuses on one-to-one communication: an English speaker needs to convey a request to a Portuguese speaker, and the message is automatically translated. Through a custom web app and multi-phase study design, we collect more than 1,750 such interactions in healthcare and everyday situations, mediated by four ST systems, involving speakers from three English dialects and two genders. We find that modern ST serves people only to a limited extent -- only around half of interactions are rated as usable -- with significant gaps in reported usability across demographic groups. Moreover, among quality metrics, we find that QA-based evaluation is a substantially stronger predictor of real-world usability than standard approaches. Together, these findings stress the importance of situated, user-centered evaluation frameworks that go beyond holistic quality scores and attend to who the technology serves -- and how well.
