Performance and Practical Considerations of Large and Small Language Models in Clinical Decision Support in Rheumatology
This work addresses the need for efficient and accessible AI tools in resource-limited healthcare settings, although its contribution is incremental because it builds on existing retrieval-augmented generation methods.
The study evaluated language models for clinical decision support in rheumatology. It found that smaller models combined with retrieval-augmented generation achieved higher diagnostic and therapeutic performance than larger models, while using less energy and enabling cost-efficient local deployment. No model, however, consistently reached specialist-level accuracy.
Large language models (LLMs) show promise for supporting clinical decision-making in complex fields such as rheumatology. Our evaluation shows that smaller language models (SLMs), combined with retrieval-augmented generation (RAG), achieve higher diagnostic and therapeutic performance than larger models, while requiring substantially less energy and enabling cost-efficient, local deployment. These features make them attractive for resource-limited healthcare settings. However, expert oversight remains essential, as no model consistently reached specialist-level accuracy in rheumatology.