AI · Apr 21

Reinforcement Learning Improves LLM Accuracy and Reasoning in Disease Classification from Radiology Reports

arXiv:2604.19060 · 14.6 · h-index: 8
Predicted impact: top 43% in AI (last 90 days) · Originality: Incremental advance
AI Analysis

For medical NLP practitioners, this method enhances both classification accuracy and reasoning in lightweight LLMs without requiring reasoning supervision.

A two-stage approach combining supervised fine-tuning and GRPO reinforcement learning improved disease classification accuracy and reasoning quality from radiology reports, outperforming baselines across three datasets.

Accurate disease classification from radiology reports is essential for many applications. While supervised fine-tuning (SFT) of lightweight LLMs improves accuracy, it can degrade reasoning. We propose a two-stage approach: SFT on disease labels followed by Group Relative Policy Optimization (GRPO) to refine predictions by optimizing accuracy and format without reasoning supervision. Across three radiologist-annotated datasets, SFT outperformed baselines and GRPO further improved classification and enhanced reasoning recall and comprehensiveness.
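
The GRPO stage described above scores each sampled completion on format and accuracy only, then compares completions within a group sampled for the same report. The sketch below illustrates that idea and is not the paper's implementation. The `<think>`/`<answer>` template, the set-F1 accuracy term, the `format_weight` value, and all function names are assumptions made for illustration. Only the group-relative normalization (reward minus group mean, divided by group standard deviation) is the standard GRPO advantage.

```python
import re
from statistics import mean, pstdev

# Hypothetical output template: free-text reasoning, then a comma-separated label list.
PATTERN = re.compile(r"^<think>(.+?)</think>\s*<answer>(.*?)</answer>$", re.DOTALL)


def parse_labels(completion):
    """Return the predicted label set, or None if the template is violated."""
    match = PATTERN.match(completion.strip())
    if match is None:
        return None
    return {s.strip().lower() for s in match.group(2).split(",") if s.strip()}


def reward(completion, gold_labels, format_weight=0.5):
    """Format reward plus accuracy reward; the reasoning text itself is never scored."""
    predicted = parse_labels(completion)
    if predicted is None:
        return 0.0  # malformed output earns nothing
    gold = {g.lower() for g in gold_labels}
    if not predicted and not gold:
        f1 = 1.0  # correctly predicted "no disease"
    else:
        true_positives = len(predicted & gold)
        f1 = 2 * true_positives / (len(predicted) + len(gold))
    return format_weight + f1


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same report."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One report, four sampled completions (made-up text, for illustration only).
gold = ["pneumonia", "pleural effusion"]
group = [
    "<think>Right lower lobe consolidation; blunted costophrenic angle.</think>"
    "<answer>pneumonia, pleural effusion</answer>",
    "<think>Consolidation is described.</think><answer>pneumonia</answer>",
    "pneumonia",
    "<think>No acute findings.</think><answer></answer>",
]
rewards = [reward(c, gold) for c in group]
advantages = group_relative_advantages(rewards)
```

Completions that beat their group's average receive positive advantages and are reinforced; those below it are discouraged. Because only the answer and the template are rewarded, any change in reasoning quality is a side effect of the policy update rather than something supervised directly.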

