Frequency Domain Enhanced U-Net for Low-Frequency Information-Rich Image Segmentation in Surgical and Deep-Sea Exploration Robots
This work addresses segmentation challenges in robotics applications where lighting and device resolution are limited. The contribution appears incremental, since it builds on existing U-Net and backbone architectures.
The paper tackles high-frequency feature attenuation in image segmentation for surgical and deep-sea exploration robots. It proposes a wavelet adaptive spectrum fusion (WASF) method and a perception frequency block (PFB) to balance cross-frequency features. The resulting FE-UNet model is reported to achieve state-of-the-art performance on cross-domain tasks such as marine organism segmentation and polyp segmentation.
In deep-sea exploration and surgical robotics scenarios, limited environmental lighting and device resolution often attenuate high-frequency image features. CNNs and the human visual system (most sensitive to mid frequencies, with low-frequency sensitivity surpassing high-frequency sensitivity) differ in their sensitivity to frequency bands. To address this difference, we experimentally quantify the contrast sensitivity function of CNNs and propose a wavelet adaptive spectrum fusion (WASF) method, inspired by biological vision mechanisms, that balances image features across frequency bands. We further design a perception frequency block (PFB) that integrates WASF to enhance frequency-domain feature extraction. Building on these components, we develop the FE-UNet model, which uses a SAM2 backbone and incorporates fine-tuned Hiera-Large modules to maintain segmentation accuracy while improving generalization. Experiments show that FE-UNet achieves state-of-the-art performance on cross-domain tasks such as marine organism segmentation and polyp segmentation, indicating robust adaptability and significant application potential. The code will be released soon.
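The abstract does not specify how WASF is computed, so the following is only a minimal sketch of the general idea of balancing frequency bands with a wavelet transform. It assumes a single-level 2D Haar decomposition and two fixed gains; the function names (`haar_dwt2`, `haar_idwt2`, `rebalance`) and the parameters `low_gain` and `high_gain` are hypothetical and stand in for whatever adaptive weighting the paper actually uses.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar transform of an array with even height and width."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]   # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]   # bottom-left, bottom-right
    ll = (a + b + c + d) / 2.0            # low-frequency approximation
    lh = (a - b + c - d) / 2.0            # detail: difference across columns
    hl = (a + b - c - d) / 2.0            # detail: difference across rows
    hh = (a - b - c + d) / 2.0            # detail: diagonal difference
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

def rebalance(x, low_gain=1.0, high_gain=1.5):
    """Scale low- and high-frequency subbands, then reconstruct.

    Assumption: fixed scalar gains are an illustrative placeholder,
    not the adaptive fusion described in the paper.
    """
    ll, lh, hl, hh = haar_dwt2(np.asarray(x, dtype=np.float64))
    return haar_idwt2(low_gain * ll, high_gain * lh,
                      high_gain * hl, high_gain * hh)

# Usage: amplify attenuated high-frequency detail in a feature map.
feature_map = np.random.default_rng(0).random((8, 8))
enhanced = rebalance(feature_map, low_gain=1.0, high_gain=1.5)
```

With both gains set to 1 the function returns its input unchanged, which is a quick check that the decomposition itself loses no information. A learned variant would replace the two scalars with weights predicted per subband or per channel.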