CV · Mar 17, 2025

Enhancing zero-shot learning in medical imaging: integrating CLIP with advanced techniques for improved chest X-ray analysis

Prakhar Bhardwaj, Sheethal Bhat, Andreas Maier

arXiv:2503.13134v1 · 3 citations · h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses the challenge of diagnosing thoracic diseases from chest X-rays when labeled data is limited. It is an incremental improvement over existing zero-shot learning methods.

The paper tackles the scarcity of labeled medical imaging data by enhancing zero-shot learning for chest X-ray analysis. It reports a relative improvement of approximately 6.5% over the state-of-the-art CheXZero model on the NIH ChestXray14 dataset and an average AUC of 0.750 on the CheXpert dataset.

Due to the large volume of medical imaging data, advanced AI methodologies are needed to assist radiologists in diagnosing thoracic diseases from chest X-rays (CXRs). Existing deep learning models often require large labeled datasets, which are scarce in medical imaging because annotation is time-consuming and depends on expert readers. In this paper, we extend the existing approach to zero-shot learning in medical imaging by integrating Contrastive Language-Image Pre-training (CLIP) with Momentum Contrast (MoCo), resulting in our proposed model, MoCoCLIP. Our method addresses challenges posed by class-imbalanced and unlabeled datasets, enabling improved detection of pulmonary pathologies. Experimental results on the NIH ChestXray14 dataset demonstrate that MoCoCLIP outperforms the state-of-the-art CheXZero model, achieving a relative improvement of approximately 6.5%. Furthermore, on the CheXpert dataset, MoCoCLIP demonstrates superior zero-shot performance, achieving an average AUC of 0.750 compared with 0.746 for CheXZero, highlighting its enhanced generalization capabilities on unseen data.
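The abstract does not say how MoCoCLIP wires CLIP and MoCo together, so the following is only a minimal sketch of the two generic MoCo ingredients that a CLIP-style image-text objective could borrow: a momentum (moving-average) update of a key encoder and a fixed-size queue of past keys used as negatives in an InfoNCE loss. All names (`momentum_update`, `info_nce`), the toy 2-D embeddings, and the values of `m` and `temperature` are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import deque


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def momentum_update(key_params, query_params, m=0.999):
    """MoCo-style moving average: key <- m * key + (1 - m) * query."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def info_nce(query, positive_key, negative_keys, temperature=0.07):
    """InfoNCE loss for one query: the positive key sits at index 0 and
    the queued keys serve as negatives."""
    q = normalize(query)
    logits = [dot(q, normalize(positive_key)) / temperature]
    logits += [dot(q, normalize(k)) / temperature for k in negative_keys]
    # Numerically stable negative log-softmax of the positive logit.
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(l - mx) for l in logits))
    return log_sum - logits[0]


# Toy data (hypothetical): a queue of earlier text embeddings used as negatives.
queue = deque(maxlen=4)
for key in ([0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]):
    queue.append(key)

image_emb = [1.0, 0.1]  # query: embedding of a chest X-ray
text_emb = [0.9, 0.2]   # positive key: embedding of its paired report

loss_matched = info_nce(image_emb, text_emb, list(queue))
loss_mismatched = info_nce(image_emb, [-1.0, 0.0], list(queue))

# After the step, the new key joins the queue; the oldest key is evicted
# automatically once the queue is full.
queue.append(text_emb)

# The key encoder slowly tracks the query encoder instead of receiving gradients.
key_params = momentum_update([0.0, 0.0], [1.0, 1.0], m=0.9)
```

In this toy setup the loss for the correctly paired image and text is lower than for a mismatched pair. The queue lets the loss see many more negatives than a single batch would hold, which is the usual motivation for adding MoCo to a contrastive objective.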
