CV · Aug 27, 2024
Pre-training Everywhere: Parameter-Efficient Fine-Tuning for Medical Image Analysis via Target Parameter Pre-training
Xingliang Lei, Yiwen Ye, Zhisong Wang et al.
Parameter-efficient fine-tuning (PEFT) techniques have emerged to address the overfitting and high computational costs associated with full fine-tuning in self-supervised learning. Mainstream PEFT methods add a few trainable parameters while keeping the pre-trained backbone parameters fixed. These methods achieve comparable, and often superior, performance to full fine-tuning, demonstrating the powerful representation ability of the pre-trained backbone. Despite this success, these methods typically ignore the initialization of the new parameters, often relying solely on random initialization. We argue that if pre-training is significantly beneficial, it should be applied to all parameters requiring representational capacity. Motivated by this, we propose Target Parameter Pre-training (TPP), a simple yet effective fine-tuning framework. TPP pre-trains target parameters, i.e., the new parameters introduced during fine-tuning, in an additional stage before PEFT. During this stage, the pre-trained backbone parameters are frozen, and only the new parameters are trainable. A defined pretext task encourages the new parameters to learn specific representations of downstream data. Subsequently, when PEFT is employed, the pre-trained new parameters are loaded to enhance fine-tuning efficiency. The proposed TPP framework is versatile, allowing integration with various pre-trained backbones, pretext tasks, and PEFT methods. We evaluated the fine-tuning performance of our method on seven public datasets, covering four modalities and two task types. The results demonstrate that TPP can be easily integrated into existing PEFT methods, significantly improving performance.
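The two-stage schedule described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: scalar weights stand in for the backbone and the new PEFT parameters, the denoising pretext task and the downstream labels are assumptions chosen for illustration, and the names `predict` and `sgd` are hypothetical. It shows only the mechanics (frozen backbone, pretext stage for the new parameters, then fine-tuning from that initialization), not the reported performance gains.

```python
import random

def predict(params, x):
    # Frozen backbone weight plus an additive adapter weight
    # (a toy stand-in for the new parameters a PEFT method introduces).
    return (params["backbone"] + params["adapter"]) * x

def sgd(params, trainable, pairs, lr=0.01, epochs=50):
    """Minimise squared error, updating only the parameters named in `trainable`."""
    for _ in range(epochs):
        for x, target in pairs:
            err = predict(params, x) - target
            grad = 2.0 * err * x  # identical for both weights in this toy model
            for name in trainable:
                params[name] -= lr * grad
    return params

rng = random.Random(0)
inputs = [rng.uniform(-1.0, 1.0) for _ in range(32)]

# Stage 0: a backbone pre-trained elsewhere; the adapter is randomly initialised.
params = {"backbone": 1.0, "adapter": rng.gauss(0.0, 0.1)}
adapter_init = params["adapter"]

# Stage 1 (target parameter pre-training): label-free pretext task on the
# downstream inputs. Assumed pretext: denoising (recover x from a noisy copy).
pretext = [(x + rng.gauss(0.0, 0.3), x) for x in inputs]
sgd(params, trainable=["adapter"], pairs=pretext)
adapter_pretrained = params["adapter"]

# Stage 2 (PEFT): supervised fine-tuning starting from the pre-trained adapter.
labelled = [(x, 0.7 * x) for x in inputs]  # hypothetical downstream labels
sgd(params, trainable=["adapter"], pairs=labelled)
```

In both stages only `adapter` appears in `trainable`, so the backbone weight never changes; the only difference between plain PEFT and this schedule is the starting value of the adapter in stage 2.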
CV · Nov 15, 2024
CoSAM: Self-Correcting SAM for Domain Generalization in 2D Medical Image Segmentation
Yihang Fu, Ziyang Chen, Yiwen Ye et al.
Medical images often exhibit distribution shifts due to variations in imaging protocols and scanners across different medical centers. Domain Generalization (DG) methods aim to train models on source domains that can generalize to unseen target domains. Recently, the segment anything model (SAM) has demonstrated strong generalization capabilities due to its prompt-based design, and has gained significant attention in image segmentation tasks. Existing SAM-based approaches attempt to address the need for manual prompts by introducing prompt generators that automatically generate these prompts. However, we argue that auto-generated prompts may not be sufficiently accurate under distribution shifts, potentially leading to incorrect predictions that still require manual verification and correction by clinicians. To address this challenge, we propose a method for 2D medical image segmentation called Self-Correcting SAM (CoSAM). Our approach begins by generating coarse masks using SAM in a prompt-free manner, providing prior prompts for the subsequent stages, and eliminating the need for prompt generators. To automatically refine these coarse masks, we introduce a generalized error decoder that simulates the correction process typically performed by clinicians. Furthermore, we generate diverse prompts as feedback based on the corrected masks, which are used to iteratively refine the predictions within a self-correcting loop, enhancing the generalization performance of our model. Extensive experiments on two medical image segmentation benchmarks across multiple scenarios demonstrate the superiority of CoSAM over state-of-the-art SAM-based methods.
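One step of the self-correcting loop, turning a corrected mask back into prompts, can be sketched as follows. The abstract does not say which prompt types CoSAM generates; this sketch assumes a bounding box and a single foreground point, both of which are prompt types SAM accepts. The function name `prompts_from_mask` is hypothetical, and the error decoder and the SAM calls themselves are not shown.

```python
def prompts_from_mask(mask):
    """Derive (box, point) prompts from a binary mask given as rows of 0/1.

    box is (x_min, y_min, x_max, y_max). point is the foreground pixel
    closest to the foreground centroid, so it always lies inside the mask.
    Returns None when the mask has no foreground.
    """
    fg = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not fg:
        return None
    xs = [x for x, _ in fg]
    ys = [y for _, y in fg]
    box = (min(xs), min(ys), max(xs), max(ys))
    cx = sum(xs) / len(fg)
    cy = sum(ys) / len(fg)
    point = min(fg, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
    return box, point

mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box, point = prompts_from_mask(mask)
```

In a loop of the kind the abstract describes, prompts like these would be fed back to the segmentation model, its new prediction corrected again, and the process repeated for a fixed number of rounds.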