AI CL Dec 20, 2024

Benchmarking LLMs and SLMs for patient reported outcomes

Matteo Marengo, Jarod Lévy, Jean-Emmanuel Bibault

arXiv:2412.16291v1 2.3 h-index: 2

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for efficient and privacy-compliant AI solutions in healthcare by comparing SLMs and LLMs for summarizing patient data, though it is incremental as it builds on existing LLM research.

The study benchmarked small language models (SLMs) against large language models (LLMs) for summarizing patient-reported outcomes in radiotherapy, finding that SLMs show promise for privacy-preserving healthcare but have limitations in high-stakes medical tasks.

LLMs have transformed the execution of numerous tasks, including those in the medical domain. Among these, summarizing patient-reported outcomes (PROs) into concise natural language reports is of particular interest to clinicians, as it enables them to focus on critical patient concerns and spend more time in meaningful discussions. While existing work with LLMs like GPT-4 has shown impressive results, real breakthroughs could arise from leveraging SLMs as they offer the advantage of being deployable locally, ensuring patient data privacy and compliance with healthcare regulations. This study benchmarks several SLMs against LLMs for summarizing patient-reported Q&A forms in the context of radiotherapy. Using various metrics, we evaluate their precision and reliability. The findings highlight both the promise and limitations of SLMs for high-stakes medical tasks, fostering more efficient and privacy-preserving AI-driven healthcare solutions.
