CL | AI | Sep 13, 2024

Contextual Evaluation of Large Language Models for Classifying Tropical and Infectious Diseases

arXiv:2409.09201v35 citations | h-index: 11 | Has Code
AI Analysis

This work addresses the need for better LLM applications in tropical and infectious disease classification, though its contribution is incremental because it builds on existing datasets and methods.

The study evaluates large language models (LLMs) on classifying tropical and infectious diseases using an expanded dataset of more than 11,000 prompts. It compares LLM outputs with those of human experts and shows that contextual information such as demographics, location, and risk factors improves LLM performance.

While large language models (LLMs) have shown promise for medical question answering, there is limited work focused on tropical and infectious diseases specifically. We build on an open-source tropical and infectious diseases (TRINDs) dataset, expanding it with demographic augmentations and with semantic augmentations in clinical and consumer styles, yielding 11,000+ prompts. We evaluate LLM performance on these prompts, comparing generalist and medical LLMs, and comparing LLM outcomes with those of human experts. We demonstrate, through systematic experimentation, the benefit of contextual information such as demographics, location, gender, and risk factors for optimal LLM response. Finally, we develop a prototype of TRINDs-LM, a research tool that provides a playground for exploring how context affects LLM outputs for health.
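As a rough illustration of what contextual prompt augmentation could look like, the sketch below expands one symptom description into several prompts, one per combination of context values. It is not the authors' pipeline: the function name `augment_prompt`, the prompt wording, and the example symptoms and context values are all hypothetical and chosen only for illustration.

```python
from itertools import product


def augment_prompt(symptoms, contexts):
    """Build one prompt per combination of context values.

    `contexts` maps a context name (e.g. "location") to a list of
    candidate values. Names, values, and the prompt template are
    illustrative assumptions, not taken from the TRINDs dataset.
    """
    keys = list(contexts)
    prompts = []
    # Cartesian product over the value lists, in key order.
    for values in product(*(contexts[k] for k in keys)):
        context_text = "; ".join(f"{k}: {v}" for k, v in zip(keys, values))
        prompts.append(
            f"Patient context: {context_text}. "
            f"Symptoms: {symptoms}. "
            "What is the most likely disease?"
        )
    return prompts


# Hypothetical example: 2 locations x 2 genders -> 4 prompt variants.
example_prompts = augment_prompt(
    "fever, chills, and headache",
    {"location": ["Kenya", "Canada"], "gender": ["female", "male"]},
)
```

Each variant can then be sent to the models under comparison, so that differences in the predicted disease can be attributed to the changed context rather than to the symptoms.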
