CL | LG | Dec 9, 2023

Enhancing Medical Specialty Assignment to Patients using NLP Techniques

arXiv:2312.05585v1 | 1 citations | h-index: 1 | NLP | IR

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient and accurate medical specialty assignment. The advance is incremental: it builds on existing NLP methods by introducing a keyword-based approach.

The paper tackled the problem of automatically assigning medical specialties to patients using NLP, and found that using keywords for text classification significantly improved performance over traditional fine-tuned language models like PubMedBERT, with concrete gains in classification metrics.

The introduction of Large Language Models (LLMs), together with the vast volume of publicly available medical data, has amplified the application of NLP to the medical domain. However, LLMs are pretrained on data that are not explicitly relevant to the domain they are applied to, and they are often biased towards the data they were pretrained on. Even when pretrained on domain-specific data, these models typically require time-consuming fine-tuning to achieve good performance on a specific task. To address these limitations, we propose an alternative approach that achieves superior performance while being computationally efficient. Specifically, we use keywords to train a deep learning architecture that outperforms a language model pretrained on a large corpus of text. Our proposal requires neither pretraining nor fine-tuning and can be applied directly to a specific setting for performing multi-label classification. Our objective is to automatically assign a new patient to the specialty of the medical professional they require, using a dataset that contains medical transcriptions and relevant keywords. To this end, we fine-tune the PubMedBERT model on this dataset, which serves as the baseline for our experiments. We then train a DNN and fine-tune the RoBERTa language model twice each, once with the keywords and once with the full transcriptions as input. We compare the performance of these approaches using relevant metrics. Our results demonstrate that using keywords for text classification significantly improves classification performance, for both a basic DL architecture and a large language model. Our approach represents a promising and efficient alternative to traditional methods for fine-tuning language models on domain-specific data and has potential applications in various medical domains.
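
To make the keyword-based idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes each record has already been reduced to a list of keywords, encodes those keywords as a multi-hot vector, and trains a single-layer softmax classifier as a simplified stand-in for the DNN the abstract mentions. The toy records, the specialty names, and all function names (`encode`, `train`, `predict`) are hypothetical and chosen only for illustration.

```python
import math

# Toy, made-up records: (keywords, specialty). Not taken from the paper's dataset.
TRAIN = [
    (["chest pain", "ecg", "stent"], "Cardiology"),
    (["ecg", "murmur", "angioplasty"], "Cardiology"),
    (["fracture", "knee", "arthroscopy"], "Orthopedics"),
    (["knee", "cast", "fracture"], "Orthopedics"),
    (["seizure", "mri", "migraine"], "Neurology"),
    (["migraine", "eeg", "seizure"], "Neurology"),
]

VOCAB = sorted({k for kws, _ in TRAIN for k in kws})
LABELS = sorted({y for _, y in TRAIN})


def encode(keywords):
    """Multi-hot vector: 1.0 where a vocabulary keyword is present, else 0.0."""
    present = set(keywords)
    return [1.0 if k in present else 0.0 for k in VOCAB]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, bias, x):
    """Class probabilities from one linear layer followed by softmax."""
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(scores)


def train(data, epochs=200, lr=0.5):
    """Plain SGD on cross-entropy loss, starting from zero weights."""
    weights = [[0.0] * len(VOCAB) for _ in LABELS]
    bias = [0.0] * len(LABELS)
    for _ in range(epochs):
        for keywords, label in data:
            x = encode(keywords)
            probs = predict_proba(weights, bias, x)
            for c, name in enumerate(LABELS):
                # Gradient of cross-entropy w.r.t. the class score.
                grad = probs[c] - (1.0 if name == label else 0.0)
                bias[c] -= lr * grad
                for j, xi in enumerate(x):
                    weights[c][j] -= lr * grad * xi
    return weights, bias


def predict(weights, bias, keywords):
    """Return the most probable specialty for a list of keywords."""
    probs = predict_proba(weights, bias, encode(keywords))
    return LABELS[probs.index(max(probs))]


WEIGHTS, BIAS = train(TRAIN)
```

A new patient's keywords can then be classified with `predict(WEIGHTS, BIAS, ["knee", "fracture"])`. Keywords outside the vocabulary are simply ignored by `encode`. The input here is a short keyword list rather than a full transcription, which is why the model can stay small and needs no pretrained language model; a realistic setup would add hidden layers, a held-out test split, and the evaluation metrics the abstract refers to.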
