CL, NE · Jun 5

LLM-Guided Evolution for Medical Decision Pipelines

arXiv:2606.07342 · 31.0 · Has Code
Originality: Incremental advance
AI Analysis

For medical AI practitioners, this work provides an inference-time, fine-tuning-free approach to optimizing LLM-based clinical pipelines, though the gains are demonstrated only on specific benchmarks.

The paper introduces LLM-guided MAP-Elites evolution as an inference-time method to automatically discover medical decision strategies across triage, consultation, and image classification tasks, achieving improvements over manually designed baselines (e.g., Semigran accuracy from 77.3% to 87.1%, emergency recall from 0.60 to 0.97).

Adapting large language models (LLMs) to clinical workflows often requires costly fine-tuning or manual prompt and pipeline engineering. We study LLM-guided MAP-Elites evolution as an inference-time alternative for discovering medical decision strategies and provide an implementation repository at https://github.com/univanxx/llm_guided_evo_medical. We formulate urgency triage, interactive consultation, and medical image classification as evolutionary searches over executable artifacts optimized by task-specific fitness functions. Across all three settings, evolution improves over manually designed baselines under practical constraints. In triage, evolved programs increase Semigran accuracy from $77.3\%$ to $87.1\%$ and emergency recall from $0.60$ to $0.97$, while improving safety-weighted held-out MIMIC-ESI performance. In interactive consultation, evolved policies improve the accuracy--cost frontier across Llama-3, Qwen-3.5, and Gemma-4 and transfer to held-out iCRAFTMD. In PneumoniaMNIST, prompt-only evolution improves frozen MedGemma VLMs while preserving strict JSON outputs. Qualitative analysis shows that the gains come from interpretable program-level mechanisms (calibrated triage boundaries, targeted evidence acquisition, selective commitment, and finding-oriented visual decision rules) rather than superficial prompt rewording alone.
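To make the search loop described in the abstract concrete, here is a minimal MAP-Elites sketch in Python. It is an illustration under stated assumptions, not the authors' implementation. The toy `CASES` data, the two-cutoff "program", the equal weighting of accuracy and emergency recall, and the emergency-rate behaviour descriptor are all hypothetical. `propose_variant` uses random jitter as a stand-in for the step where an LLM would rewrite the candidate program or prompt.

```python
import random

# Toy labelled cases: (severity score in [0, 1], true triage label).
# Purely illustrative; not drawn from Semigran or MIMIC-ESI.
CASES = [(0.05, "self-care"), (0.20, "self-care"), (0.35, "non-emergency"),
         (0.50, "non-emergency"), (0.62, "emergency"), (0.75, "emergency"),
         (0.90, "emergency"), (0.45, "non-emergency")]


def triage(program, severity):
    """Run a candidate 'program'; here it is just a pair of cutoffs."""
    emergency_cutoff, urgent_cutoff = program
    if severity >= emergency_cutoff:
        return "emergency"
    if severity >= urgent_cutoff:
        return "non-emergency"
    return "self-care"


def evaluate(program):
    """Return (fitness, behaviour descriptor) for one candidate."""
    preds = [triage(program, s) for s, _ in CASES]
    truths = [t for _, t in CASES]
    accuracy = sum(p == t for p, t in zip(preds, truths)) / len(CASES)
    on_emergencies = [p for p, t in zip(preds, truths) if t == "emergency"]
    recall = sum(p == "emergency" for p in on_emergencies) / len(on_emergencies)
    fitness = 0.5 * accuracy + 0.5 * recall  # assumed safety weighting
    emergency_rate = preds.count("emergency") / len(preds)  # descriptor in [0, 1]
    return fitness, emergency_rate


def propose_variant(parent, rng):
    """Stand-in for the LLM mutation step: jitter both cutoffs."""
    e, u = (min(1.0, max(0.0, x + rng.gauss(0, 0.1))) for x in parent)
    return (max(e, u), min(e, u))  # keep emergency cutoff >= urgent cutoff


def map_elites(seed_program, iterations=200, bins=5, seed=0):
    """Keep the best candidate found in each behaviour cell."""
    rng = random.Random(seed)
    archive = {}  # cell index -> (fitness, program)

    def insert(program):
        fitness, descriptor = evaluate(program)
        cell = min(int(descriptor * bins), bins - 1)
        # A cell's elite is replaced only by a strictly fitter candidate.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, program)

    insert(seed_program)
    for _ in range(iterations):
        _, parent = rng.choice(list(archive.values()))  # sample an elite
        insert(propose_variant(parent, rng))
    return archive


if __name__ == "__main__":
    for cell, (fitness, program) in sorted(map_elites((0.9, 0.5)).items()):
        print(cell, round(fitness, 3), program)
```

The archive keeps one elite per behaviour cell rather than a single global best, so the search retains candidates with different trade-offs, such as cautious versus permissive triage. Swapping `propose_variant` for an LLM call and `evaluate` for a task-specific fitness function would give the general shape of the LLM-guided variant.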

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
