IV CV | Jan 30

Vision-Language Controlled Deep Unfolding for Joint Medical Image Restoration and Segmentation

Ping Chen, Zicheng Huang, Xiangming Wang, Yungeng Liu, Bingyu Liang, Haijin Zeng, Yongyong Chen

arXiv:2601.23103v12.6 | h-index: 8 | Has Code

Originality: Incremental advance

AI Analysis

The work addresses the need for more robust solutions in clinical workflows by combining low-level and high-level medical image processing tasks. The advance is incremental, since it builds on existing deep unfolding and Mamba mechanisms.

The paper tackles the joint problem of medical image restoration and segmentation by proposing VL-DUN, a framework that integrates the two tasks for mutual refinement. It reports improvements of 0.92 dB in PSNR and 9.76% in Dice coefficient.
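For readers unfamiliar with the two reported metrics, the following is a minimal NumPy sketch of their standard definitions: PSNR for restoration quality and the Dice coefficient for segmentation overlap. The function names, the `data_range` default, and the `eps` smoothing term are assumptions made for illustration; the paper's exact evaluation protocol is not described here and may differ.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB; data_range is the maximum possible pixel value."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

def dice(pred_mask, true_mask, eps=1e-8):
    """Dice coefficient between two binary masks (eps avoids 0/0 for two empty masks)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)
```

Because PSNR is logarithmic, a gain of 0.92 dB corresponds to a reduction in mean squared error by a factor of about 10^(-0.092), roughly 0.81, when the data range is unchanged.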

We propose VL-DUN, a principled framework for joint All-in-One Medical Image Restoration and Segmentation (AiOMIRS) that bridges the gap between low-level signal recovery and high-level semantic understanding. While standard pipelines treat these tasks in isolation, our core insight is that they are fundamentally synergistic: restoration provides clean anatomical structures to improve segmentation, while semantic priors regularize the restoration process. VL-DUN resolves the sub-optimality of sequential processing through two primary innovations. (1) We formulate AiOMIRS as a unified optimization problem, deriving an interpretable joint unfolding mechanism in which restoration and segmentation are mathematically coupled for mutual refinement. (2) We introduce a frequency-aware Mamba mechanism that captures long-range dependencies for global segmentation while preserving the high-frequency textures necessary for restoration. This allows efficient global context modeling with linear complexity and mitigates the spectral bias of standard architectures. As a pioneering work on the AiOMIRS task, VL-DUN establishes a new state of the art across multi-modal benchmarks, improving PSNR by 0.92 dB and the Dice coefficient by 9.76%. Our results demonstrate that joint collaborative learning offers a superior, more robust solution for complex clinical workflows than isolated task processing. The code is available at https://github.com/cipi666/VLDUN.
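The general idea of coupling restoration and segmentation inside an unfolded iteration can be illustrated with a toy sketch. This is not VL-DUN's architecture: the degradation is assumed to be additive noise, the learned modules of a real deep unfolding network are replaced by hand-written steps, and every name here (`segment`, `semantic_prior_step`, `unfolded_joint`, `step`, `weight`) is hypothetical. Each stage takes a gradient step on the data-fidelity term, applies a prior step guided by the current mask, and then re-segments the refined estimate.

```python
import numpy as np

def segment(x, threshold=0.5):
    """Toy segmentation: threshold the current estimate (stand-in for a learned module)."""
    return x > threshold

def semantic_prior_step(x, mask, weight):
    """Toy semantic prior: pull each pixel toward the mean of its own segment."""
    out = x.copy()
    for label in (False, True):
        region = mask == label
        if region.any():
            out[region] = (1.0 - weight) * x[region] + weight * x[region].mean()
    return out

def unfolded_joint(y, num_stages=5, step=0.5, weight=0.5):
    """Run a fixed number of stages that alternate restoration and segmentation."""
    x = y.astype(np.float64).copy()
    mask = segment(x)
    for _ in range(num_stages):
        x = x - step * (x - y)                     # gradient step on 0.5 * ||x - y||^2
        x = semantic_prior_step(x, mask, weight)   # segmentation regularizes restoration
        mask = segment(x)                          # cleaner estimate updates segmentation
    return x, mask

# Usage on a noisy piecewise-constant 1D signal (synthetic data, for illustration only).
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(50), np.ones(50)])
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
restored, mask = unfolded_joint(noisy)
```

In a deep unfolding network the number of stages is fixed in the same way, but the step sizes and the prior and segmentation operators are learned end to end rather than written by hand.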

