CL, CY | Aug 12, 2025

Out of the Box, into the Clinic? Evaluating State-of-the-Art ASR for Clinical Applications for Older Adults

Bram van Dijk, Tiberon Kuiper, Sirin Aoulad si Ahmed, Armel Levebvre, Jake Johnson, Jan Duin, Simon Mooijaart, Marco Spruit

arXiv:2508.08684v3 | 4.92 citations | h-index: 4

Originality: Synthesis-oriented

AI Analysis

This paper addresses the bottleneck of reliable ASR for underrepresented groups, such as older adults, in clinical applications. Its contribution is incremental, however, because it benchmarks existing models on new data rather than proposing a new method.

The study evaluated state-of-the-art ASR models on the speech of older Dutch adults interacting with a clinical chatbot. Generic multilingual models outperformed fine-tuned ones, and truncating generic models helped balance accuracy against speed. Some inputs nevertheless caused high word error rates.

Voice-controlled interfaces can support older adults in clinical contexts -- with chatbots being a prime example -- but reliable Automatic Speech Recognition (ASR) for underrepresented groups remains a bottleneck. This study evaluates state-of-the-art ASR models on language use of older Dutch adults, who interacted with the Welzijn.AI chatbot designed for geriatric contexts. We benchmark generic multilingual ASR models, and models fine-tuned for Dutch spoken by older adults, while also considering processing speed. Our results show that generic multilingual models outperform fine-tuned models, which suggests recent ASR models can generalise well out of the box to real-world datasets. Moreover, our results indicate that truncating generic models is helpful in balancing the accuracy-speed trade-off. Nonetheless, we also find inputs which cause a high word error rate and place them in context.
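The abstract reports results as word error rate (WER), so a minimal sketch of that metric may help: the word-level edit distance between a reference transcript and the ASR output, divided by the number of reference words. The function name, the lowercasing and whitespace tokenisation, and the Dutch example sentences are all assumptions made for illustration; the abstract does not say which text normalisation or tooling the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: lowercasing plus whitespace splitting is the only
    normalisation applied; real evaluations often also strip punctuation.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one dropped word and one inserted word
# against a five-word reference.
wer = word_error_rate("ik voel me vandaag goed", "ik voel vandaag heel goed")
print(f"WER: {wer:.2f}")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference, which is one way a single input can produce a very high score.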
