One Hyper-Initializer for All Network Architectures in Medical Image Analysis
The proposed hyper-initializer provides a plug-and-play solution for medical image analysis tasks, reducing the need for architecture-specific pre-training, though the gain in flexibility is incremental.
The paper tackles the inflexibility of existing pre-training methods in medical image analysis by proposing a hyper-initializer that can initialize any network architecture after a single pre-training run, achieving strong performance across multiple modalities and tasks in data-limited settings.
Pre-training is essential to deep learning model performance, especially in medical image analysis tasks where limited training data are available. However, existing pre-training methods are inflexible because the pre-trained weights of one model cannot be reused by other network architectures. In this paper, we propose an architecture-irrelevant hyper-initializer, which can initialize any given network architecture well after being pre-trained only once. The proposed initializer is a hypernetwork that takes a downstream architecture as an input graph and outputs the initialization parameters of that architecture. We show the effectiveness and efficiency of the hyper-initializer through extensive experimental results on multiple medical imaging modalities, especially in data-limited fields. Moreover, we prove that the proposed algorithm can be reused as a favorable plug-and-play initializer for any downstream architecture and task (both classification and segmentation) of the same modality.
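To make the "architecture graph in, initialization parameters out" interface concrete, the following is a minimal toy sketch in plain Python. It is not the authors' implementation: the names `make_chain_graph` and `ToyHyperInitializer`, the three node features, the single averaging step over predecessors, and the sine-based decoder are all assumptions chosen for illustration, and the shared parameters are random here, whereas in the paper's setting they would be learned during the one-off pre-training.

```python
import math
import random


def make_chain_graph(layer_shapes):
    """Describe an architecture as a graph: one node per layer, edges follow data flow.

    Each shape is (fan_out, fan_in). Hypothetical format, for illustration only.
    """
    nodes = [{"id": i, "shape": shape} for i, shape in enumerate(layer_shapes)]
    edges = [(i, i + 1) for i in range(len(nodes) - 1)]
    return {"nodes": nodes, "edges": edges}


class ToyHyperInitializer:
    """Toy stand-in for a hypernetwork: architecture graph in, per-layer weights out."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Shared parameters, one per node feature. Random here (assumption);
        # in practice they would be learned once during pre-training.
        self.proj = [rng.gauss(0.0, 1.0) for _ in range(3)]

    def _embed(self, graph):
        n = len(graph["nodes"])
        # Node features: log fan-out, log fan-in, relative depth in the network.
        feats = [
            [math.log(node["shape"][0]), math.log(node["shape"][1]), node["id"] / max(n - 1, 1)]
            for node in graph["nodes"]
        ]
        # One round of message passing: average each node with its predecessors.
        embeddings = []
        for i, own in enumerate(feats):
            group = [own] + [feats[src] for src, dst in graph["edges"] if dst == i]
            embeddings.append([sum(col) / len(group) for col in zip(*group)])
        return embeddings

    def initialize(self, graph):
        """Return {node_id: weight matrix} matching each node's requested shape."""
        params = {}
        for node, emb in zip(graph["nodes"], self._embed(graph)):
            rows, cols = node["shape"]
            phase = sum(p * e for p, e in zip(self.proj, emb))
            scale = math.sqrt(2.0 / cols)  # He-style scaling by fan-in
            # Coordinate-based decoder: one value per (row, col) position.
            params[node["id"]] = [
                [scale * math.sin(phase + 0.37 * r + 0.91 * c) for c in range(cols)]
                for r in range(rows)
            ]
        return params


# The same initializer object serves two different architectures.
hyper_init = ToyHyperInitializer()
small_net = make_chain_graph([(8, 4), (2, 8)])
deep_net = make_chain_graph([(16, 4), (16, 16), (16, 16), (3, 16)])
small_params = hyper_init.initialize(small_net)
deep_params = hyper_init.initialize(deep_net)
```

The point of the sketch is the calling pattern: one `hyper_init` object produces correctly shaped weights for both `small_net` and `deep_net` without any per-architecture retraining, which is the reuse property the abstract describes.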