IV | CV | Sep 20, 2023

More complex encoder is not all you need

arXiv:2309.11139v3 | 1 citation | h-index: 9 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses segmentation accuracy in medical imaging, particularly for 3D data, by enhancing decoder design, though it is incremental as it builds on existing U-Net frameworks.

The paper tackles 3D medical image segmentation by improving the decoder in U-Net variants, which often pair a complex encoder with a neglected decoder. It introduces neU-Net, which uses a novel Sub-pixel Convolution for upsampling and a multi-scale wavelet inputs module, and reports state-of-the-art results on the Synapse and ACDC datasets.

U-Net and its variants have been widely used in medical image segmentation. However, most current U-Net variants confine their improvement strategies to building more complex encoders, while leaving the decoder unchanged or adopting a simple symmetric structure. These approaches overlook the true function of the decoder: receiving low-resolution feature maps from the encoder and restoring feature map resolution and lost information through upsampling. As a result, the decoder, especially its upsampling component, plays a crucial role in enhancing segmentation outcomes. However, in 3D medical image segmentation, the commonly used transposed convolution can produce visual artifacts. This issue stems from the absence of a direct relationship between adjacent pixels in the output feature map. Furthermore, a plain encoder already possesses sufficient feature extraction capability, because downsampling gradually expands the receptive field, but the information lost during downsampling cannot be ignored. To address this gap in the research, we extend our focus beyond the encoder and introduce neU-Net (i.e., not complex encoder U-Net), which incorporates a novel Sub-pixel Convolution for upsampling to construct a powerful decoder. Additionally, we introduce a multi-scale wavelet inputs module on the encoder side to provide additional information. Our model design achieves excellent results, surpassing other state-of-the-art methods on both the Synapse and ACDC datasets.
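To make the upsampling idea concrete, here is a minimal sketch of the generic sub-pixel (depth-to-space) rearrangement in 3D, written with NumPy. It is not the paper's Sub-pixel Convolution: the abstract does not give that design, and the function name `pixel_shuffle_3d`, the channel ordering, and the upscale factor `r` are assumptions made for illustration. In the usual sub-pixel scheme, a preceding convolution produces `C * r**3` channels at low resolution, and this step only moves those channels into space, so each low-resolution voxel fills one contiguous `r x r x r` output block.

```python
import numpy as np


def pixel_shuffle_3d(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C * r**3, D, H, W) array into (C, D*r, H*r, W*r).

    Hypothetical helper for illustration; the channel ordering
    (depth offset, then height offset, then width offset) is an assumption.
    """
    c_in, d, h, w = x.shape
    if c_in % (r ** 3) != 0:
        raise ValueError("channel count must be divisible by r**3")
    c = c_in // (r ** 3)

    # Split the channel axis into (C, r_d, r_h, r_w).
    x = x.reshape(c, r, r, r, d, h, w)
    # Pair each spatial axis with its sub-voxel offset: (C, D, r_d, H, r_h, W, r_w).
    x = x.transpose(0, 4, 1, 5, 2, 6, 3)
    # Merge each pair so the offsets become fine-grained spatial positions.
    return x.reshape(c, d * r, h * r, w * r)


# Toy low-resolution feature map: 8 channels on a 2x2x2 grid, upscaled by r = 2.
low_res = np.arange(8 * 2 * 2 * 2, dtype=np.float64).reshape(8, 2, 2, 2)
high_res = pixel_shuffle_3d(low_res, r=2)
print(high_res.shape)
```

The rearrangement itself has no learnable parameters and preserves every input value; all learning happens in the convolution that precedes it.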

Code Implementations: 1 repo