CL | AI | Nov 19, 2024

Enhancing Multi-Class Disease Classification: Neoplasms, Cardiovascular, Nervous System, and Digestive Disorders Using Advanced LLMs

arXiv:2411.12712v1 | 2 citations | h-index: 5 | 2024 27th International Conference on Computer and Information Technology (ICCIT)
Originality Synthesis-oriented
AI Analysis

This work addresses disease classification for medical professionals, but it is incremental as it applies existing models to a specific dataset without major methodological breakthroughs.

The study tackled multi-class disease classification using pre-trained language models on a medical abstracts dataset, finding that BioBERT achieved 97% accuracy and XLNet 96%, while a custom LastBERT model reached 87.10% accuracy.

In this research, we explored improving multi-class disease classification with pre-trained language models on the Medical-Abstracts-TC-Corpus, which spans five medical conditions. We excluded one of the five categories and examined the four named in the title: neoplasms, cardiovascular, nervous system, and digestive disorders. We assessed four LLMs: BioBERT, XLNet, BERT, and a novel base model, LastBERT. BioBERT, which was pre-trained on medical data, performed best in medical text classification (97% accuracy). Surprisingly, XLNet followed closely (96% accuracy), demonstrating its generalizability across domains even though it was not pre-trained on medical data. LastBERT, a custom model based on a lighter version of BERT, also proved competitive with 87.10% accuracy (just under BERT's 89.33%). Our findings confirm the importance of specialized models such as BioBERT and also support the use of more general solutions like XLNet and of well-tuned transformer architectures with fewer parameters (in this case, LastBERT) in medical domain tasks.
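As a rough illustration of the evaluation setup the abstract describes (reducing a five-class corpus to four classes, then reporting accuracy), the following is a minimal sketch in plain Python. It is not the authors' code. The label strings, the `excluded_class` placeholder, the helper names, and the toy data are all assumptions made for illustration; the real corpus labels and the models' predictions are not reproduced here.

```python
from collections import Counter

# Hypothetical label names. The four retained classes follow the paper title;
# "excluded_class" is a placeholder for the fifth corpus category.
KEPT = ("neoplasms", "cardiovascular", "nervous_system", "digestive")


def filter_classes(examples, kept=KEPT):
    """Keep only the (text, label) pairs whose label is in `kept`."""
    return [(text, label) for text, label in examples if label in kept]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per reference label: correct predictions / examples of that label."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / n for label, n in total.items()}


# Toy data standing in for abstracts and classifier outputs (made up).
examples = [
    ("abstract a", "neoplasms"),
    ("abstract b", "cardiovascular"),
    ("abstract c", "excluded_class"),
    ("abstract d", "digestive"),
    ("abstract e", "nervous_system"),
]
filtered = filter_classes(examples)
y_true = [label for _, label in filtered]
y_pred = ["neoplasms", "cardiovascular", "digestive", "digestive"]

overall = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)
```

Overall accuracy alone can hide a weak class, which is why the sketch also computes per-class recall; whether the paper reports per-class figures is not stated in the abstract.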

