Deep Esophageal Clinical Target Volume Delineation using Encoded 3D Spatial Context of Tumors, Lymph Nodes, and Organs At Risk
This work addresses the challenge of reducing variability in radiotherapy planning for esophageal cancer patients, representing an incremental improvement with domain-specific enhancements.
The paper tackled the problem of automating clinical target volume delineation for esophageal cancer radiotherapy, which suffers from high variability, by using encoded spatial contexts of tumors, lymph nodes, and organs at risk, resulting in an average Dice score of 83.9% and surface distance of 4.2 mm, improving over the state-of-the-art by 3.8% and 2.4 mm.
Clinical target volume (CTV) delineation from radiotherapy computed tomography (RTCT) images is used to define the treatment areas containing the gross tumor volume (GTV) and/or sub-clinical malignant disease for radiotherapy (RT). High intra- and inter-user variability makes this a particularly difficult task for esophageal cancer. This motivates automated solutions, which is the aim of our work. Because CTV delineation is highly context-dependent--it must encompass the GTV and regional lymph nodes (LNs) while also avoiding excessive exposure to the organs at risk (OARs)--we formulate it as a deep contextual appearance-based problem using encoded spatial contexts of these anatomical structures. This allows the deep network to better learn from and emulate the margin- and appearance-based delineation performed by human physicians. Additionally, we develop domain-specific data augmentation to inject robustness to our system. Finally, we show that a simple 3D progressive holistically nested network (PHNN), which avoids computationally heavy decoding paths while still aggregating features at different levels of context, can outperform more complicated networks. Cross-validated experiments on a dataset of 135 esophageal cancer patients demonstrate that our encoded spatial context approach can produce concrete performance improvements, with an average Dice score of 83.9% and an average surface distance of 4.2 mm, representing improvements of 3.8% and 2.4 mm, respectively, over the state-of-the-art approach.