MED-PH | CV | Feb 26

Automated Dose-Based Anatomic Region Classification of Radiotherapy Treatment for Big Data Applications

arXiv:2602.23536v1 | h-index: 4
Originality: Incremental advance
AI Analysis

This work provides a scalable and standardized solution for curating large radiotherapy planning databases, enabling more reliable downstream analyses for researchers and clinicians.

The authors developed software to automatically classify radiotherapy treatment plans by anatomic region, addressing the challenge of curating large, multi-institutional datasets. Their method infers regions from dose-volume overlap with deep-learning segmentations, achieving 95% Top-1 Accuracy and 91% Exact Accuracy on a 100-plan test dataset.

Curation is a significant barrier to using 'big data' radiotherapy planning databases of 100,000+ patients. Anatomic site stratification is essential for downstream analyses, but current methods rely on plan labels or target nomenclature, which are inconsistent and therefore unreliable for multi-institutional data. We developed software that automates labeling by inferring anatomic regions directly from dose-volume overlap with deep-learning segmentations, eliminating reliance on metadata. The software processes DICOM files in bulk, using deep learning to segment 118 structures (organs, glands, and bones) categorized into six regions: Cranial, Head and Neck, Pelvis, Abdomen, Thorax, and Extremity. The 85% and 50% isodose lines are converted to structures to compute organ-specific dose-overlap metrics, and plans are assigned ranked regional labels based on these intersections. The algorithm was refined using 109 expert-labeled cases and validated on 100 consecutive clinical plans. On the 100-plan test dataset, the algorithm achieved 91% Exact Accuracy (matching all expert labels and their order), 94% Top-2 Accuracy (matching the top two expert regions regardless of order), and 95% Top-1 Accuracy (matching the primary expert label). The automated workflow demonstrated high accuracy and robustness. The 95% Top-1 Accuracy is particularly significant because it enables reliable querying of plans by primary treatment site. Detailed analysis of the few mismatched cases showed that most involved treated areas at the border between two anatomic regions, where the label is ambiguous between the two even under a common-sense interpretation. This algorithm provides a scalable, standardized solution for curating the large, multi-institutional datasets required for 'big data' in radiotherapy, and it is an important complement to text-based approaches.
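
The ranking and scoring steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the overlap metric, the thresholds, or how the 85% and 50% isodose structures are combined. The sketch assumes a single isodose structure, represents every structure as a set of voxel indices, scores each organ by the fraction of its volume inside the isodose, and sums those fractions per region. The names `ORGAN_REGION`, `rank_regions`, `score_plan`, and the `min_fraction` cutoff are hypothetical, and the organ list is a small stand-in for the 118 segmented structures.

```python
from collections import defaultdict

# Hypothetical organ-to-region lookup (assumption); the paper's full
# 118-structure mapping is not given in the abstract.
ORGAN_REGION = {
    "brain": "Cranial",
    "parotid_left": "Head and Neck",
    "lung_left": "Thorax",
    "heart": "Thorax",
    "liver": "Abdomen",
    "bladder": "Pelvis",
    "femur_left": "Extremity",
}


def rank_regions(isodose_voxels, organ_voxels, min_fraction=0.05):
    """Return region names ordered by summed organ overlap with one isodose.

    isodose_voxels: set of voxel indices inside the isodose structure.
    organ_voxels:   dict mapping organ name -> set of voxel indices.
    min_fraction:   assumed cutoff below which an organ is ignored.
    """
    scores = defaultdict(float)
    for organ, voxels in organ_voxels.items():
        if not voxels:
            continue  # skip organs that were not segmented
        fraction = len(voxels & isodose_voxels) / len(voxels)
        if fraction >= min_fraction:
            scores[ORGAN_REGION[organ]] += fraction
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda region: (-scores[region], region))


def score_plan(predicted, expert):
    """Apply the three accuracy definitions quoted in the abstract to one plan."""
    return {
        "exact": predicted == expert,                    # all labels, same order
        "top2": set(predicted[:2]) == set(expert[:2]),   # top two, any order
        "top1": predicted[:1] == expert[:1],             # primary label only
    }


# Toy example: voxels are plain integers along one axis.
isodose = set(range(100, 200))
organs = {
    "heart": set(range(120, 160)),      # fully inside the isodose
    "lung_left": set(range(150, 250)),  # half inside
    "liver": set(range(180, 380)),      # one tenth inside
    "brain": set(range(0, 50)),         # no overlap
}
predicted = rank_regions(isodose, organs)
result = score_plan(predicted, ["Thorax", "Abdomen"])
swapped = score_plan(["Abdomen", "Thorax"], ["Thorax", "Abdomen"])
```

In the toy example the heart and lung contribute to Thorax and the liver to Abdomen, so the plan is labeled Thorax first and Abdomen second. The `swapped` case shows why the three metrics differ: a prediction with the right two regions in the wrong order passes Top-2 but fails both Exact and Top-1 under these definitions.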

