CL AI | Apr 23, 2025

Comparing Large Language Models and Traditional Machine Translation Tools for Translating Medical Consultation Summaries: A Pilot Study

Andy Li, Wei Zhou, Rashina Hoda, Chris Bain, Peter Poon

arXiv:2504.16601v1 | 4.91 citations | h-index: 34

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of accurate medical translation for clinicians and patients, but its contribution is incremental: it is a pilot study that builds on existing translation methods.

This study compared large language models (LLMs) and traditional machine translation tools for translating medical consultation summaries into Arabic, Chinese, and Vietnamese, finding that traditional tools generally performed better, especially for complex texts, while LLMs showed promise for simpler summaries in Vietnamese and Chinese.

This study evaluates how well large language models (LLMs) and traditional machine translation (MT) tools translate medical consultation summaries from English into Arabic, Chinese, and Vietnamese. It assesses both patient-friendly and clinician-focused texts using standard automated metrics. Traditional MT tools generally performed better, especially for complex texts, while LLMs showed promise when translating simpler summaries, particularly in Vietnamese and Chinese. Arabic translations improved with complexity due to the language's morphology. Overall, while LLMs offer contextual flexibility, they remain inconsistent, and current evaluation metrics fail to capture clinical relevance. The study highlights the need for domain-specific training, improved evaluation methods, and human oversight in medical translation.
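To make "standard automated metrics" concrete, here is a minimal sketch of one common family of such metrics: a character n-gram F-score in the style of chrF. The summary above does not name the metrics the study used, so the choice of metric, the function names, and the parameter defaults (`max_n=3`, `beta=2.0`) are all assumptions made for illustration, not the authors' evaluation code.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring whitespace (a simplification)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def char_ngram_fscore(hypothesis: str, reference: str,
                      max_n: int = 3, beta: float = 2.0) -> float:
    """Simplified chrF-style score (illustrative, not the official metric).

    Averages n-gram precision and recall over n = 1..max_n, then combines
    them with an F-beta score in which recall is weighted beta times as
    heavily as precision.
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the texts is shorter than n characters
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)


if __name__ == "__main__":
    # Made-up strings, purely to show the call shape.
    print(char_ngram_fscore("abcd", "abce"))
```

A score like this rewards surface overlap with a reference translation only. Two translations with nearly identical scores can still differ in whether a dosage or a negation is rendered correctly, which illustrates the summary's point that such metrics fail to capture clinical relevance.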
