MED-PH LGJun 13, 2025

Automated Treatment Planning for Interstitial HDR Brachytherapy for Locally Advanced Cervical Cancer using Deep Reinforcement Learning

Mohammadamin Moradi, Runyu Jiang, Yingzi Liu, Malvern Madondo, Tianming Wu, James J. Sohn, Xiaofeng Yang, Yasmin Hasan, Zhen Tian

arXiv:2506.11957v12 citationsh-index: 16

Originality Incremental advance

AI Analysis

This addresses the need for more consistent and efficient treatment planning for patients with locally advanced cervical cancer, though it is incremental as it builds on existing optimization methods.

The study tackled the problem of manual treatment planning for HDR brachytherapy in cervical cancer by developing an automated framework using deep reinforcement learning, which achieved an average score of 93.89% on test patients, outperforming clinical plans at 91.86%.

High-dose-rate (HDR) brachytherapy plays a critical role in the treatment of locally advanced cervical cancer but remains highly dependent on manual treatment planning expertise. The objective of this study is to develop a fully automated HDR brachytherapy planning framework that integrates reinforcement learning (RL) and dose-based optimization to generate clinically acceptable treatment plans with improved consistency and efficiency. We propose a hierarchical two-stage autoplanning framework. In the first stage, a deep Q-network (DQN)-based RL agent iteratively selects treatment planning parameters (TPPs), which control the trade-offs between target coverage and organ-at-risk (OAR) sparing. The agent's state representation includes both dose-volume histogram (DVH) metrics and current TPP values, while its reward function incorporates clinical dose objectives and safety constraints, including D90, V150, V200 for targets, and D2cc for all relevant OARs (bladder, rectum, sigmoid, small bowel, and large bowel). In the second stage, a customized Adam-based optimizer computes the corresponding dwell time distribution for the selected TPPs using a clinically informed loss function. The framework was evaluated on a cohort of patients with complex applicator geometries. The proposed framework successfully learned clinically meaningful TPP adjustments across diverse patient anatomies. For the unseen test patients, the RL-based automated planning method achieved an average score of 93.89%, outperforming the clinical plans which averaged 91.86%. These findings are notable given that score improvements were achieved while maintaining full target coverage and reducing CTV hot spots in most cases.

View on arXiv PDF

Similar