AI CL HC | Dec 23, 2025

Automated stereotactic radiosurgery planning using a human-in-the-loop reasoning large language model agent

Humza Nusrat, Luke Francisco, Bing Luo, Hassan Bagher-Ebadian, Joshua Kim, Karen Chin-Snyder, Salim Siddiqui, Mira Shah, Eric Mellon, Mohammad Ghassemi, Anthony Doemer, Benjamin Movsas

arXiv:2512.20586v1 | 3.3 | h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses the problem of opaque AI systems in clinical settings by providing transparent, auditable treatment planning for brain metastases. The advance is incremental, since it builds on existing LLM methods.

The researchers automated stereotactic radiosurgery planning with an LLM-based agent that uses chain-of-thought reasoning. In a retrospective study of 41 patients, the agent achieved dosimetry comparable to human planners on the primary endpoints (PTV coverage, maximum dose, conformity index, gradient index; all p > 0.21) and delivered a lower cochlear dose than the human baseline (p = 0.022).

Stereotactic radiosurgery (SRS) demands precise dose shaping around critical structures, yet black-box AI systems have seen limited clinical adoption due to opacity concerns. We tested whether chain-of-thought reasoning improves agentic planning in a retrospective cohort of 41 patients with brain metastases treated with 18 Gy single-fraction SRS. We developed SAGE (Secure Agent for Generative Dose Expertise), an LLM-based planning agent for automated SRS treatment planning. Two variants generated plans for each case: one using a non-reasoning model, one using a reasoning model. The reasoning variant produced plan dosimetry comparable to that of human planners on primary endpoints (PTV coverage, maximum dose, conformity index, gradient index; all p > 0.21) while reducing cochlear dose below human baselines (p = 0.022). When prompted to improve conformity, the reasoning model demonstrated systematic planning behaviors including prospective constraint verification (457 instances) and trade-off deliberation (609 instances), while the standard model exhibited almost none of these deliberative processes (0 and 7 instances, respectively). Content analysis revealed that constraint verification and causal explanation were concentrated in the reasoning agent. The optimization traces serve as auditable logs, offering a path toward transparent automated planning.
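The primary endpoints named in the abstract can be illustrated with a small sketch. The abstract does not state which definitions the authors used, so this example assumes the commonly used Paddick conformity index (TV_PIV^2 / (TV x PIV)) and a gradient index defined as the half-prescription isodose volume divided by the prescription isodose volume. The function name `srs_metrics`, the equal-voxel-volume simplification, and the toy dose values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence


def srs_metrics(
    dose_gy: Sequence[float],
    in_ptv: Sequence[bool],
    rx_gy: float = 18.0,
) -> Dict[str, float]:
    """Plan-quality metrics from per-voxel doses.

    Assumes every voxel has the same volume, so volumes reduce to voxel
    counts. Definitions are assumed (Paddick CI, half-isodose GI), not
    confirmed by the source.
    """
    if len(dose_gy) != len(in_ptv):
        raise ValueError("dose_gy and in_ptv must have the same length")

    # TV: target volume; PIV: volume receiving at least the prescription dose.
    tv = sum(1 for m in in_ptv if m)
    piv = sum(1 for d in dose_gy if d >= rx_gy)
    # Volume receiving at least half the prescription dose.
    half_piv = sum(1 for d in dose_gy if d >= 0.5 * rx_gy)
    # TV_PIV: target voxels that receive at least the prescription dose.
    tv_piv = sum(1 for d, m in zip(dose_gy, in_ptv) if m and d >= rx_gy)

    if tv == 0 or piv == 0:
        raise ValueError("target and prescription isodose volumes must be non-empty")

    return {
        "coverage": tv_piv / tv,
        "paddick_ci": tv_piv ** 2 / (tv * piv),
        "gradient_index": half_piv / piv,
        "max_dose_gy": max(dose_gy),
    }


# Toy eight-voxel example with made-up doses (not patient data).
toy_dose = [20.0, 19.0, 18.0, 17.0, 18.5, 10.0, 9.0, 5.0]
toy_ptv = [True, True, True, True, False, False, False, False]
metrics = srs_metrics(toy_dose, toy_ptv, rx_gy=18.0)
print(metrics)
```

In this toy case three of the four target voxels reach 18 Gy and one non-target voxel also does, so coverage is 0.75 and the Paddick index is 9/16. A real evaluation would work from the dose grid and structure masks exported by the treatment planning system and would weight voxels by their actual volume.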
