CV, Oct 26, 2023

Task-driven Prompt Evolution for Foundation Models

arXiv:2310.17128v1, 3 citations, h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses the performance gap for medical imaging applications using foundation models, representing an incremental advancement in automatic visual prompt-tuning.

The paper tackles the underperformance of promptable foundation models such as SAM in medical image segmentation. It proposes a plug-and-play prompt optimization technique (SAMPOT) that uses the downstream task to refine the prompt, improving on the human-provided initial prompt in approximately 75% of cases for lung segmentation in chest X-rays.

Promptable foundation models, particularly the Segment Anything Model (SAM), have emerged as a promising alternative to traditional task-specific supervised learning for image segmentation. However, many evaluation studies have found their performance on medical imaging modalities to be underwhelming compared to conventional deep learning methods. In the world of large pre-trained language and vision-language models, learning prompts from downstream tasks has achieved considerable success in improving performance. In this work, we propose a plug-and-play Prompt Optimization Technique for foundation models like SAM (SAMPOT) that utilizes the downstream segmentation task to optimize the human-provided prompt and obtain improved performance. We demonstrate the utility of SAMPOT on lung segmentation in chest X-ray images and obtain an improvement over human-provided initial prompts in a significant number of cases ($\sim75\%$). We hope this work will lead to further investigations in the nascent field of automatic visual prompt-tuning.
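
To make the idea of task-driven prompt optimization concrete, here is a minimal toy sketch in Python. It is not the SAMPOT implementation, and the abstract does not specify how the task signal is computed. Everything named here is a hypothetical stand-in: `toy_segmenter` plays the role of a frozen promptable model, `toy_task_score` plays the role of a downstream task signal, and finite differences replace the backpropagation a real system would presumably use. The only point illustrated is that the prompt coordinates are updated while the model stays fixed.

```python
import math

GRID = 32                      # toy image is GRID x GRID pixels
TARGET_CENTER = (20.0, 12.0)   # hypothetical location favoured by the task signal
RADIUS = 6.0                   # radius of the toy soft-disc masks


def toy_segmenter(prompt):
    """Frozen stand-in for a promptable model: a soft disc mask around the point prompt."""
    px, py = prompt
    return [[1.0 / (1.0 + math.exp(math.hypot(x - px, y - py) - RADIUS))
             for x in range(GRID)] for y in range(GRID)]


# Fixed reference shape used by the toy task score (an assumption of this sketch).
REFERENCE = toy_segmenter(TARGET_CENTER)


def toy_task_score(mask):
    """Stand-in task signal: soft Dice overlap between the mask and the reference shape."""
    inter = sum(m * r for mrow, rrow in zip(mask, REFERENCE)
                for m, r in zip(mrow, rrow))
    total = sum(map(sum, mask)) + sum(map(sum, REFERENCE))
    return 2.0 * inter / total


def score_at(prompt):
    """Task score of the mask produced by a given prompt."""
    return toy_task_score(toy_segmenter(prompt))


def optimize_prompt(prompt, steps=100, lr=5.0, eps=1e-2):
    """Gradient ascent on the prompt coordinates; the model itself is never updated."""
    x, y = prompt
    for _ in range(steps):
        # Central finite differences approximate the gradient of the score
        # with respect to the prompt location.
        gx = (score_at((x + eps, y)) - score_at((x - eps, y))) / (2 * eps)
        gy = (score_at((x, y + eps)) - score_at((x, y - eps))) / (2 * eps)
        x, y = x + lr * gx, y + lr * gy
    return (x, y)


initial_prompt = (14.0, 14.0)  # stands in for a human-provided click
refined_prompt = optimize_prompt(initial_prompt)
print("initial score:", score_at(initial_prompt))
print("refined score:", score_at(refined_prompt))
```

In this toy setting the refined prompt drifts toward the location that the task score favours. With a real model such as SAM, the segmenter call would be replaced by the model's prompt-conditioned mask prediction, and the score by whatever task-derived signal is available at inference time.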

