CV | LG | Nov 14, 2022

The Role of Local Alignment and Uniformity in Image-Text Contrastive Learning on Medical Images

arXiv:2211.07254v2 | 10 citations | h-index: 128
Originality: Incremental advance
AI Analysis

This work addresses the need for better pretraining methods for localized tasks like semantic segmentation or object detection in medical imaging, though it is incremental as it builds on existing contrastive learning techniques.

The paper aims to improve performance on localized downstream tasks in medical image analysis by studying and modifying the local contrastive losses used in image-text contrastive learning. The resulting approach outperformed methods without local losses on 12 of 18 chest X-ray tasks.

Image-text contrastive learning has proven effective for pretraining medical image models. When targeting localized downstream tasks like semantic segmentation or object detection, additional local contrastive losses that align image regions with sentences have shown promising results. We study how local contrastive losses are related to global (per-sample) contrastive losses and which effects they have on localized medical downstream tasks. Based on a theoretical comparison, we propose to remove some components of local losses and replace others by a novel distribution prior which enforces uniformity of representations within each sample. We empirically study this approach on chest X-ray tasks and find it to be very effective, outperforming methods without local losses on 12 of 18 tasks.
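
To make the two ingredients named in the abstract concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes a standard symmetric InfoNCE loss for the global (per-sample) term and, as a stand-in for the paper's distribution prior, a generic pairwise Gaussian-potential uniformity term computed over the region representations within each sample. The function names, tensor shapes, and the `temperature` and `t` defaults are illustrative assumptions.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Project vectors onto the unit hypersphere along the last axis."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def global_info_nce(img, txt, temperature=0.1):
    """Symmetric per-sample image-text contrastive loss (InfoNCE).

    img, txt: arrays of shape (B, D); row i of each forms a positive pair.
    """
    img, txt = l2_normalize(img), l2_normalize(txt)
    logits = img @ txt.T / temperature  # (B, B) similarity matrix
    idx = np.arange(len(img))

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[idx, idx].mean()  # positives lie on the diagonal

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))


def within_sample_uniformity(regions, t=2.0):
    """Illustrative uniformity term over region representations of each sample.

    regions: array of shape (B, R, D), R region embeddings per sample.
    Returns the batch mean of log E[exp(-t * ||z_i - z_j||^2)] over distinct
    region pairs within a sample. Lower values mean more spread-out regions.
    NOTE: an assumed stand-in, not the prior proposed in the paper.
    """
    z = l2_normalize(regions)
    sq_dist = ((z[:, :, None, :] - z[:, None, :, :]) ** 2).sum(-1)  # (B, R, R)
    rows, cols = np.triu_indices(z.shape[1], k=1)  # distinct pairs only
    pair_dist = sq_dist[:, rows, cols]  # (B, R * (R - 1) / 2)
    return np.log(np.exp(-t * pair_dist).mean(axis=1)).mean()
```

In this sketch, region embeddings that collapse to a single point give a uniformity value of 0, while spread-out embeddings give a negative value, so minimizing the term pushes the regions of one image apart. A combined pretraining objective would add it to the global loss with a weighting factor, which is left out here because the source does not specify one.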

