CL AI LG · Aug 7, 2023

MedMine: Examining Pre-trained Language Models on Medication Mining

Haifa Alrdahi, Lifeng Han, Hendrik Šuvalov, Goran Nenadic

arXiv:2308.03629v21.37 citations · h-index: 35 · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses obstacles in deploying automatic extraction models for healthcare applications, but it is incremental as it examines existing models without introducing new methods.

The study fine-tuned and evaluated pre-trained language models, including Med7 and XLM-RoBERTa, for medication mining from clinical text on the n2c2-2018 datasets, and found imbalanced performance across entity types and clinical events.

Automatic medication mining from clinical and biomedical text has become a popular topic due to its real impact on healthcare applications and the recent development of powerful language models (LMs). However, fully automatic extraction models still face obstacles that must be overcome before they can be deployed directly into clinical practice. These obstacles include their imbalanced performance across different entity types and clinical events. In this work, we examine current state-of-the-art pre-trained language models (PLMs) on such tasks via fine-tuning, including the monolingual model Med7 and the multilingual large language model (LLM) XLM-RoBERTa. We compare their advantages and drawbacks using historical medication mining shared task data sets from the n2c2-2018 challenges. We report the findings from these fine-tuning experiments so that they can facilitate future research on addressing these obstacles, for instance, how to combine the models' outputs, merge the models, or improve their overall accuracy through ensemble learning and data augmentation. MedMine is part of the M3 Initiative: https://github.com/HECTA-UoM/M3
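To make "imbalanced performance across entity types" concrete, the following is a minimal sketch of exact-match scoring per entity type, plus one naive way of combining two models' outputs (a plain set union). It is not the authors' evaluation code: the function names `per_type_scores` and `combine_union`, the span format `(start, end, label)`, and the toy spans are all assumptions made for illustration. The labels `Drug`, `Dosage`, and `ADE` are taken from the n2c2-2018 medication annotation scheme.

```python
def per_type_scores(gold, pred):
    """Exact-match precision, recall, and F1 for each entity type.

    gold, pred: sets of (start, end, label) tuples (hypothetical format).
    Returns {label: (precision, recall, f1)}.
    """
    labels = {label for _, _, label in gold | pred}
    scores = {}
    for label in sorted(labels):
        g = {span for span in gold if span[2] == label}
        p = {span for span in pred if span[2] == label}
        tp = len(g & p)  # spans whose offsets and label both match
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def combine_union(pred_a, pred_b):
    """Naive combination: keep every span predicted by either model."""
    return pred_a | pred_b


# Toy data, invented for illustration only (not results from the paper).
gold = {(0, 7, "Drug"), (8, 13, "Dosage"), (20, 24, "ADE"), (30, 36, "ADE")}
model_a = {(0, 7, "Drug"), (8, 13, "Dosage")}  # misses every ADE span
model_b = {(0, 7, "Drug"), (20, 24, "ADE")}    # finds one ADE, misses Dosage

scores_a = per_type_scores(gold, model_a)
scores_union = per_type_scores(gold, combine_union(model_a, model_b))

for label, (p, r, f1) in scores_union.items():
    print(f"{label:8s} P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

A union tends to raise recall at the cost of precision, since it also keeps every false positive from both models. Voting or confidence-weighted merging are alternatives when more than two models are available.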
