CL · Feb 17, 2025

Exploring Large Language Models in Healthcare: Insights into Corpora Sources, Customization Strategies, and Evaluation Metrics

arXiv:2502.11861v1 · 3 citations · h-index: 15 · Has Code
Originality: Synthesis-oriented
AI Analysis

The paper addresses challenges in applying LLMs to healthcare by identifying critical gaps in data and evaluation. Its contribution is incremental: it synthesizes existing research rather than proposing new methods.

This study reviewed the use of Large Language Models (LLMs) in healthcare, analyzing 61 articles to identify gaps in corpus fairness and biases, and highlighting the need for better integration of evidence-based guidelines and standardized evaluation frameworks.

This study reviewed the use of Large Language Models (LLMs) in healthcare, focusing on their training corpora, customization techniques, and evaluation metrics. A systematic search of studies from 2021 to 2024 identified 61 articles. Four types of corpora were used: clinical resources, literature, open-source datasets, and web-crawled data. Common construction techniques included pre-training, prompt engineering, and retrieval-augmented generation, with 44 studies combining multiple methods. Evaluation metrics were categorized into process, usability, and outcome metrics, with outcome metrics divided into model-based and expert-assessed outcomes. The study identified critical gaps in corpus fairness, which contributed to biases from geographic, cultural, and socio-economic factors. The reliance on unverified or unstructured data highlighted the need for better integration of evidence-based clinical guidelines. Future research should focus on developing a tiered corpus architecture with vetted sources and dynamic weighting, while ensuring model transparency. Additionally, the lack of standardized evaluation frameworks for domain-specific models called for comprehensive validation of LLMs in real-world healthcare settings.

