Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation
This work addresses the high annotation cost of medical image segmentation and enables more cost-effective model deployment across diverse equipment and protocols, though its contribution is incremental in that it mainly makes better use of existing annotated data.
The paper tackles the challenge of building versatile medical image segmentation models without large, fully annotated datasets. It proposes a method that trains on multi-source data with partial labels, which reduces annotation costs. Experimental results on an eight-source abdominal segmentation dataset show superior performance compared to state-of-the-art approaches.
A versatile medical image segmentation model applicable to images acquired with diverse equipment and protocols can facilitate model deployment and maintenance. However, building such a model typically demands a large, diverse, and fully annotated dataset, which is challenging to obtain due to the labor-intensive nature of data curation. To address this challenge, we propose a cost-effective alternative that harnesses multi-source data with only partial or sparse segmentation labels for training, substantially reducing the cost of developing a versatile model. We devise strategies for model self-disambiguation, prior knowledge incorporation, and imbalance mitigation to tackle the challenges of inconsistently labeled multi-source data, namely label ambiguity and imbalances across modalities, datasets, and classes. Experimental results on a multi-modal dataset compiled from eight different sources for abdominal structure segmentation demonstrate the effectiveness and superior performance of our method compared to state-of-the-art alternative approaches. We anticipate that its cost-saving features, which optimize the utilization of existing annotated data and reduce annotation efforts for new data, will have a significant impact in the field.
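To make the label-ambiguity problem concrete: when a source dataset annotates only some structures, a voxel marked as background may in fact belong to a structure that source never labeled. The sketch below illustrates one generic way to handle this, a marginal cross-entropy in which the predicted probabilities of unannotated classes are folded into the background probability. It is an illustrative assumption, not the self-disambiguation strategy proposed in the paper, whose details are not given in the abstract. The function name `marginal_cross_entropy` and its arguments are hypothetical.

```python
import numpy as np

def marginal_cross_entropy(logits, target, labeled_classes):
    """Cross-entropy for a partially labeled source (illustrative sketch).

    logits:          (N, C) per-voxel class scores; class 0 is background.
    target:          (N,) integer labels; voxels of classes that this source
                     does not annotate appear as 0 (background).
    labeled_classes: foreground class indices annotated in this source.
    """
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)

    num_classes = logits.shape[1]
    unlabeled = np.array(
        [c for c in range(1, num_classes) if c not in labeled_classes],
        dtype=int,
    )

    # A "background" voxel may be true background or any unannotated class,
    # so its likelihood is the sum of those probabilities.
    merged_background = probs[:, 0] + probs[:, unlabeled].sum(axis=1)

    # Annotated foreground voxels use their own class probability.
    own_class = probs[np.arange(len(target)), target]
    p_target = np.where(target == 0, merged_background, own_class)

    return float(-np.log(p_target + 1e-12).mean())
```

With this formulation the model is not penalized for predicting an unannotated organ on a voxel that the source marks as background, which is the basic requirement for mixing sources that label different subsets of structures.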