CV · AI · Apr 13

Budget-Aware Uncertainty for Radiotherapy Segmentation QA Using nnU-Net

arXiv:2604.11798 · 17.4 · h-index: 6
Predicted impact top 92% in CV · last 90 days · Originality: Synthesis-oriented
AI Analysis

For radiotherapy clinicians, this work provides a practical QA method to identify segmentation errors efficiently, though the approach is incremental as it applies existing techniques (ensembles, calibration) to a specific domain.

The paper proposes a budget-aware uncertainty-driven QA framework for radiotherapy CTV segmentation using nnU-Net, combining uncertainty quantification and post-hoc calibration. Temperature scaling improves calibration, and calibrated checkpoint ensembles achieve the best uncertainty-error alignment, enabling targeted manual review under realistic revision constraints.

Accurate delineation of the Clinical Target Volume (CTV) is essential for radiotherapy planning, yet remains time-consuming and difficult to assess, especially for complex treatments such as Total Marrow and Lymph Node Irradiation (TMLI). While deep learning-based auto-segmentation can reduce workload, safe clinical deployment requires reliable cues indicating where models may be wrong. In this work, we propose a budget-aware, uncertainty-driven quality assurance (QA) framework built on nnU-Net, combining uncertainty quantification and post-hoc calibration to produce voxel-wise uncertainty maps (based on predictive entropy) that can guide targeted manual review. We compare temperature scaling (TS), deep ensembles (DE), checkpoint ensembles (CE), and test-time augmentation (TTA), evaluated both individually and in combination, on TMLI as a representative use case. Reliability is assessed through ROI-masked calibration metrics and uncertainty-error alignment under realistic revision constraints, summarized as AUC over the top 0-5% most uncertain voxels. Across configurations, segmentation accuracy remains stable, whereas TS substantially improves calibration. Uncertainty-error alignment improves most with calibrated checkpoint-based inference, yielding uncertainty maps that more consistently highlight regions requiring manual edits. Overall, integrating calibration with efficient ensembling appears to be a promising strategy for implementing a budget-aware QA workflow for radiotherapy segmentation.
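The quantities named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. All function names (`softmax`, `predictive_entropy`, `ensemble_mean`, `error_coverage_auc`) are hypothetical, and the array layout (classes on axis 0, voxels flattened on axis 1) is an assumption. The abstract does not give the exact definition of the budgeted AUC, so the sketch assumes one plausible reading: the fraction of erroneous voxels captured when reviewing the top-k most uncertain voxels, integrated over budgets from 0 to 5% and normalized by the budget range.

```python
import numpy as np

def softmax(logits, temperature=1.0, axis=0):
    """Temperature-scaled softmax; temperature > 1 softens the distribution."""
    z = logits / temperature
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def predictive_entropy(probs, axis=0, eps=1e-12):
    """Voxel-wise entropy of the predictive distribution (in nats)."""
    return -(probs * np.log(probs + eps)).sum(axis=axis)

def ensemble_mean(member_logits, temperature=1.0):
    """Average the (optionally temperature-scaled) softmax outputs of members.

    Assumption: calibration is applied per member before averaging; the
    abstract does not state the order used in the paper.
    """
    probs = [softmax(l, temperature) for l in member_logits]
    return np.mean(np.stack(probs), axis=0)

def error_coverage_auc(uncertainty, errors, max_budget=0.05, steps=50):
    """Normalized area under the 'errors captured vs. review budget' curve.

    uncertainty: per-voxel uncertainty (e.g. predictive entropy)
    errors:      boolean per-voxel map, True where prediction != reference
    max_budget:  largest fraction of voxels a reviewer may inspect
    """
    u = np.asarray(uncertainty).ravel()
    e = np.asarray(errors).ravel().astype(bool)
    order = np.argsort(-u, kind="stable")        # most uncertain first
    cum_err = np.cumsum(e[order])                # errors found after k voxels
    total = max(int(e.sum()), 1)                 # avoid division by zero
    budgets = np.linspace(0.0, max_budget, steps + 1)
    k = np.round(budgets * u.size).astype(int)   # voxels reviewed per budget
    captured = np.where(k > 0, cum_err[np.maximum(k - 1, 0)], 0) / total
    # Trapezoidal integration, normalized so the result lies in [0, 1].
    area = np.sum((captured[1:] + captured[:-1]) / 2.0 * np.diff(budgets))
    return area / max_budget

# Toy usage: 3 ensemble members, 2 classes, 1000 voxels of random logits.
rng = np.random.default_rng(0)
members = [rng.normal(size=(2, 1000)) for _ in range(3)]
probs = ensemble_mean(members, temperature=1.5)
entropy_map = predictive_entropy(probs)
reference = rng.integers(0, 2, size=1000)
error_map = probs.argmax(axis=0) != reference
score = error_coverage_auc(entropy_map, error_map, max_budget=0.05)
```

Under this definition, a higher score means that more of the erroneous voxels fall within the small fraction a reviewer can afford to inspect. In practice the temperature would be fitted on held-out data (for example by minimizing negative log-likelihood) rather than fixed by hand as in the toy usage.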
