CL, SD, AS, ML | Nov 20, 2017

Speech recognition for medical conversations

arXiv:1711.07274v2 | 96 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of accurate speech recognition for medical conversations, which could aid clinical documentation. It is incremental, however, in that it applies existing methods to a new domain-specific dataset.

The researchers tackled the transcription of doctor-patient conversations by collecting a large-scale dataset of 14,000 hours of clinical conversations and exploring CTC and LAS speech recognition models. They found that LAS was more resilient to noisy data, and that the models performed well on important medical utterances but made errors in casual conversation.

In this work we explored building automatic speech recognition models for transcribing doctor-patient conversations. We collected a large-scale dataset of clinical conversations (14,000 hours), designed the task to represent the real-world scenario, and explored several alignment approaches to iteratively improve data quality. We explored both CTC and LAS systems for building speech recognition models. The LAS system was more resilient to noisy data, while CTC required more data cleanup. We provide a detailed analysis to understand performance on clinical tasks. Our analysis showed that the speech recognition models performed well on important medical utterances, while errors occurred in casual conversations. Overall, we believe the resulting models can provide reasonable quality in practice.

