ADS_UNet: A Nested UNet for Histopathology Image Segmentation
This work addresses segmentation challenges in medical imaging, specifically histopathology, and offers a more efficient and effective method. The contribution is incremental, since it builds on existing UNet architectures.
The paper tackles the loss of feature diversity in nested UNet models for histopathology image segmentation. It proposes ADS_UNet, a stage-wise additive training algorithm with resource-efficient deep supervision. ADS_UNet outperforms Transformer-based models by 1.08 and 0.6 points on the CRAG and BCSS datasets, respectively, while reducing GPU consumption by 63% and training time by 66%.
The UNet model consists of fully convolutional network (FCN) layers arranged as contracting encoder and upsampling decoder maps. Nested arrangements of these encoder and decoder maps give rise to extensions of the UNet model, such as UNet^e and UNet++. Other refinements include constraining the outputs of the convolutional layers to discriminate between segment labels when trained end to end, a property called deep supervision. This constraint reduces feature diversity in these nested UNet models despite their large parameter space. Furthermore, for texture segmentation, pixel correlations at multiple scales contribute to the classification task; hence, explicit deep supervision of shallower layers is likely to enhance performance. In this paper, we propose ADS_UNet, a stage-wise additive training algorithm that incorporates resource-efficient deep supervision in shallower layers and takes performance-weighted combinations of the sub-UNets to create the segmentation model. We provide empirical evidence on three histopathology datasets to support the claim that the proposed ADS_UNet reduces correlations between constituent features and improves performance while being more resource efficient. We demonstrate that ADS_UNet outperforms state-of-the-art Transformer-based models by 1.08 and 0.6 points on the CRAG and BCSS datasets, respectively, while requiring only 37% of the GPU consumption and 34% of the training time that the Transformers require.
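The abstract's "performance-weighted combinations of the sub-UNets" can be illustrated with a small sketch. The abstract does not state the weighting rule, so the sketch assumes the simplest one: each sub-UNet's weight is its validation score divided by the sum of all scores. The function names, the flattened-pixel layout, and the example numbers are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def performance_weights(scores: Sequence[float]) -> List[float]:
    """Turn per-model validation scores into weights that sum to 1.

    ASSUMPTION: plain score normalization; the paper's actual rule
    is not given in the abstract and may differ.
    """
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    return [s / total for s in scores]


def combine_predictions(
    prob_maps: Sequence[Sequence[Sequence[float]]],
    weights: Sequence[float],
) -> List[int]:
    """Fuse per-model class probabilities and return one label per pixel.

    prob_maps[m][p][c] is the probability that model m assigns to
    class c at (flattened) pixel p.
    """
    n_pixels = len(prob_maps[0])
    n_classes = len(prob_maps[0][0])
    labels = []
    for p in range(n_pixels):
        # Weighted sum over models for every class at this pixel.
        fused = [
            sum(w * pm[p][c] for w, pm in zip(weights, prob_maps))
            for c in range(n_classes)
        ]
        # Pick the class with the largest fused probability.
        labels.append(max(range(n_classes), key=fused.__getitem__))
    return labels


# Hypothetical example: three sub-UNets, two pixels, two classes.
example_scores = [0.5, 0.3, 0.2]
example_maps = [
    [[0.9, 0.1], [0.4, 0.6]],  # sub-UNet 1
    [[0.2, 0.8], [0.3, 0.7]],  # sub-UNet 2
    [[0.3, 0.7], [0.6, 0.4]],  # sub-UNet 3
]
example_labels = combine_predictions(
    example_maps, performance_weights(example_scores)
)
```

In this toy example the first sub-UNet has the highest score, so its confident vote for class 0 at the first pixel outweighs the other two models, which both prefer class 1. An unweighted majority vote would have picked class 1 there.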