CL | SD | AS | Oct 24, 2022

Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech

arXiv:2210.13397v4 | 10 citations | h-index: 20
Originality: Synthesis-oriented
AI Analysis

This work addresses communication barriers in medical settings for non-English speakers, but it is incremental as it builds on existing ASR methods for specific low-resource domains.

The paper tackles the development of automatic speech recognition (ASR) systems for low-resource medical conversational telephone speech, with the aim of aiding patient-physician communication. It focuses on the German-Arabic and German-Vietnamese language pairs and covers data collection, acoustic model architectures, and dialect-induced difficulties.

Language barriers present a great challenge in our increasingly connected and global world. Especially within the medical domain, e.g., in a hospital or emergency room, communication difficulties and delays may lead to malpractice and non-optimal patient care. In the HYKIST project, we consider patient-physician communication, more specifically between a German-speaking physician and an Arabic- or Vietnamese-speaking patient. Currently, a doctor can call the Triaphon service to get assistance from an interpreter to help facilitate communication. The HYKIST goal is to support the usually non-professional bilingual interpreter with an automatic speech translation system to improve patient care and help overcome language barriers. In this work, we present our ASR system development efforts for this conversational telephone speech translation task in the medical domain for two language pairs, covering data collection, various acoustic model architectures, and dialect-induced difficulties.

