CL, LG · Dec 4, 2020

Benchmarking Automated Clinical Language Simplification: Dataset, Algorithm, and Evaluation

Junyu Luo, Zifei Zheng, Hanzhong Ye, Muchao Ye, Yaqing Wang, Quanzeng You, Cao Xiao, Fenglong Ma

arXiv:2012.02420v225.6584 citations · h-index: 43 · Has Code

Originality: Incremental advance

AI Analysis

This work is significant for patients with low health literacy, as it aims to make medical information more accessible and understandable, addressing a gap in existing clinical language simplification research.

This paper addresses the challenge of simplifying complex clinical language for patients with low health literacy. It introduces MedLane, a new dataset for automated clinical language simplification, and DECLARE, a novel model that achieves state-of-the-art performance against eight strong baselines.

Patients with low health literacy usually have difficulty understanding medical jargon and the complex structure of professional medical language. Although several studies have proposed methods for automatically translating expert language into layperson-understandable language, only a few address accuracy and readability simultaneously in the clinical domain. Simplification of clinical language therefore remains a challenging task that previous work has not fully addressed. To benchmark this task, we construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches. We also propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance compared with eight strong baselines. To evaluate performance fairly, we further propose three specific evaluation metrics. Experimental results demonstrate the utility of the annotated MedLane dataset and the effectiveness of the proposed DECLARE model.
