CV · AI · Mar 30, 2023

The impact of training dataset size and ensemble inference strategies on head and neck auto-segmentation

Edward G. A. Henderson, Marcel van Herk, Eliana M. Vasquez Osorio

arXiv:2303.17318v1 · 2.85 citations · h-index: 39

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of limited data for training robust auto-segmentation models in medical imaging, offering practical solutions for clinical applications where large datasets are scarce.

The study investigated how training dataset size and ensemble inference strategies affect the accuracy of head and neck auto-segmentation models in radiotherapy. Segmentation quality improved as the training set grew, up to 250 scans, and ensemble methods significantly improved results for all organs, with the largest gains on the smallest datasets.

Convolutional neural networks (CNNs) are increasingly being used to automate segmentation of organs-at-risk in radiotherapy. Since large sets of highly curated data are scarce, we investigated how much data is required to train accurate and robust head and neck auto-segmentation models. For this, an established 3D CNN was trained from scratch with different sized datasets (25-1000 scans) to segment the brainstem, parotid glands and spinal cord in CTs. Additionally, we evaluated multiple ensemble techniques to improve the performance of these models. The segmentations improved with training set size up to 250 scans and the ensemble methods significantly improved performance for all organs. The impact of the ensemble methods was most notable in the smallest datasets, demonstrating their potential for use in cases where large training datasets are difficult to obtain.
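The abstract does not say which ensemble techniques were evaluated, so the following is only a minimal sketch of one common approach: averaging the per-voxel class probabilities of several independently trained models and taking the most likely class. The function names (`ensemble_mean_argmax`, `dice`), the toy probabilities, and the two-class setup are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence


def ensemble_mean_argmax(
    model_probs: Sequence[Sequence[Sequence[float]]],
) -> List[int]:
    """Average class probabilities across models, then pick the top class.

    model_probs[m][v][c] is the probability that model m assigns class c
    to voxel v (voxels are flattened into a 1D list for simplicity).
    """
    n_models = len(model_probs)
    n_voxels = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    labels = []
    for v in range(n_voxels):
        mean = [
            sum(model_probs[m][v][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        labels.append(max(range(n_classes), key=mean.__getitem__))
    return labels


def dice(pred: Sequence[int], truth: Sequence[int], label: int) -> float:
    """Dice similarity coefficient for one label (1.0 if it is absent in both)."""
    p = [x == label for x in pred]
    t = [x == label for x in truth]
    intersection = sum(a and b for a, b in zip(p, t))
    denom = sum(p) + sum(t)
    return 1.0 if denom == 0 else 2 * intersection / denom


# Hypothetical toy data: 3 models, 4 voxels, 2 classes (0 = background, 1 = organ).
model_a = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.6, 0.4]]
model_b = [[0.8, 0.2], [0.7, 0.3], [0.1, 0.9], [0.3, 0.7]]
model_c = [[0.7, 0.3], [0.3, 0.7], [0.3, 0.7], [0.2, 0.8]]
truth = [0, 1, 1, 1]

single = ensemble_mean_argmax([model_a])  # model A on its own
combined = ensemble_mean_argmax([model_a, model_b, model_c])

print("single model Dice:", dice(single, truth, 1))
print("ensemble Dice:", dice(combined, truth, 1))
```

In this toy case model A mislabels the last voxel on its own, while the averaged prediction recovers it. That mirrors the general intuition behind ensembling, although the sizes of the gains reported in the paper come from its own experiments, not from this sketch.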

