UXNet: Searching Multi-level Feature Aggregation for 3D Medical Image Segmentation
This work addresses the need for robust segmentation in medical imaging to support diagnosis and treatment. It offers a NAS approach that makes the UNet design more flexible while keeping the search efficient, though its contribution is an incremental improvement over existing UNet-based methods.
The paper tackles 3D medical image segmentation by proposing UXNet, a neural architecture search method that searches for both multi-level feature aggregation strategies and block-wise operators. The discovered architecture outperforms state-of-the-art models on public benchmarks in terms of Dice score, especially at boundary locations and on tiny tissues.
Aggregating multi-level feature representations plays a critical role in achieving robust volumetric medical image segmentation, which is important for computer-aided diagnosis and treatment. Recent neural architecture search (NAS) methods typically search for the optimal operators in each network layer but lack a good strategy for searching feature aggregations. This paper proposes a novel NAS method for 3D medical image segmentation, named UXNet, which searches both the scale-wise feature aggregation strategies and the block-wise operators in the encoder-decoder network. UXNet has several appealing benefits. (1) It significantly improves the flexibility of the classical UNet architecture, which only aggregates encoder and decoder feature representations of the same resolution. (2) A continuous relaxation of UXNet is carefully designed, so that the search can be performed in an efficient, differentiable manner. (3) Extensive experiments demonstrate the effectiveness of UXNet compared with recent NAS methods for medical image segmentation. The architecture discovered by UXNet outperforms existing state-of-the-art models in terms of Dice on several public 3D medical image segmentation benchmarks, especially at boundary locations and on tiny tissues. The search is computationally cheap: the best-performing network is found in less than 1.5 days on two TitanXP GPUs.
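To make the idea of a continuous relaxation over feature aggregations concrete, the following is a minimal sketch, not UXNet's actual formulation, which the abstract does not spell out. It assumes a DARTS-style softmax relaxation: each candidate encoder scale gets a learnable logit, and the decoder input is the softmax-weighted sum of all candidates. The names `relaxed_aggregation`, `discretize`, and `alphas` are hypothetical, and feature maps are represented as flat lists that are assumed to have already been resampled to a common resolution.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relaxed_aggregation(features, alphas):
    """Softmax-weighted sum of candidate feature maps.

    features: list of equally sized flat lists, one per encoder scale
              (assumed already resampled to the target resolution).
    alphas:   one architecture logit per candidate (hypothetical name).
    """
    weights = softmax(alphas)
    size = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(size)
    ]


def discretize(alphas, k=1):
    """After the search, keep the indices of the k highest-weighted candidates."""
    order = sorted(range(len(alphas)), key=lambda i: alphas[i], reverse=True)
    return sorted(order[:k])


if __name__ == "__main__":
    # Two candidate scales with equal logits contribute equally.
    feats = [[1.0, 2.0], [3.0, 4.0]]
    print(relaxed_aggregation(feats, [0.0, 0.0]))
    # Keep the two strongest connections out of three candidates.
    print(discretize([0.1, 2.0, -1.0], k=2))
```

Because the mixture is a smooth function of the logits, the logits can in principle be optimized by gradient descent together with the network weights. A classical UNet corresponds to the special case in which only the same-resolution candidate is kept.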