Less Could Be Better: Parameter-efficient Fine-tuning Advances Medical Vision Foundation Models
This work addresses efficient transfer learning for medical imaging tasks, showing that tuning fewer than 1% of a foundation model's parameters can modestly outperform full-parameter fine-tuning on most of the evaluated tasks.
The study investigated parameter-efficient fine-tuning (PEFT) for medical vision foundation models, finding that LoRA outperformed full-parameter fine-tuning in 13 out of 18 tasks by up to 2.9% using fewer than 1% tunable parameters and achieved an AUROC of 80.6% with 1% labeled data on NIH ChestX-ray14.
Parameter-efficient fine-tuning (PEFT), initially developed for adapting pre-trained large language models, has recently emerged as an effective approach to transfer learning on computer vision tasks. However, the effectiveness of PEFT on medical vision foundation models remains unclear. As a proof of concept, we conducted a detailed empirical study of applying PEFT to chest radiography foundation models. Specifically, we examined LoRA, a representative PEFT method, and compared it against full-parameter fine-tuning (FFT) on two self-supervised radiography foundation models across three well-established chest radiograph datasets. Our results showed that LoRA outperformed FFT in 13 out of 18 transfer learning tasks, by up to 2.9%, while using fewer than 1% tunable parameters. Combining LoRA with foundation models, we set a new state of the art on a range of data-efficient learning tasks, such as an AUROC score of 80.6% using 1% labeled data on NIH ChestX-ray14. We hope this study draws more attention from the community to the use of PEFT for transfer learning on medical imaging tasks. Code and models are available at https://github.com/RL4M/MED-PEFT.
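
To make the idea behind LoRA concrete, the following is a minimal sketch of a single low-rank-adapted linear layer, written in plain Python for readability. It is not the implementation from the repository above: the class name `LoRALinear`, the `alpha` scaling argument, the initialization scales, and the toy dimensions are all illustrative assumptions. The pre-trained weight `W` stays frozen, and only the two small factors `A` and `B` would be trained.

```python
import random


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A.

    Hypothetical illustration only; a real setup would use a deep learning
    framework and apply this to selected layers of the foundation model.
    """

    def __init__(self, d_in, d_out, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        self.scale = alpha / rank  # assumed scaling convention
        # Stand-in for the frozen pre-trained weight.
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable factors: A starts small and random, B starts at zero,
        # so the adapted layer initially reproduces the frozen layer.
        self.A = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = [sum(w * xi for w, xi in zip(row, x)) for row in self.W]
        down = [sum(a * xi for a, xi in zip(row, x)) for row in self.A]
        up = [sum(b * di for b, di in zip(row, down)) for row in self.B]
        return [b + self.scale * u for b, u in zip(base, up)]

    def trainable_fraction(self):
        """Share of this layer's parameters that LoRA would update."""
        frozen = self.d_in * self.d_out
        trainable = self.rank * (self.d_in + self.d_out)
        return trainable / (frozen + trainable)


layer = LoRALinear(d_in=8, d_out=8, rank=2)
y = layer.forward([1.0] * 8)
print(len(y), layer.trainable_fraction())
```

In this toy layer the trainable share is large because the dimensions are tiny; the share shrinks as the layer width grows relative to the rank, and shrinks further when only some layers of a model are adapted.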