CV · AI · LG · Jun 3, 2021

Semantic-Aware Contrastive Learning for Multi-object Medical Image Segmentation

arXiv:2106.01596v2 · 12 citations
AI Analysis

This paper addresses the challenge of multi-object segmentation in medical imaging, offering a method to enhance encoder-decoder networks without labels. It appears incremental, however, because it builds on existing contrastive learning techniques.

The paper adapts contrastive learning from image-level classification to pixel-level segmentation of multi-object medical images by proposing a semantic-aware approach that uses attention masks. It reports Dice score improvements of 5.53% and 6.09% on two medical datasets and an mIoU improvement of 2.75% on natural images.

Medical image segmentation, or computing voxelwise semantic masks, is a fundamental yet challenging task. To increase the ability of encoder-decoder neural networks to perform this task across large clinical cohorts, contrastive learning provides an opportunity to stabilize model initialization and enhance encoders without labels. However, multiple target objects (with different semantic meanings) may exist in a single image, which poses a problem for adapting traditional contrastive learning methods from the prevalent 'image-level classification' setting to 'pixel-level segmentation'. In this paper, we propose a simple semantic-aware contrastive learning approach that leverages attention masks to advance multi-object semantic segmentation. Briefly, we embed different semantic objects into different clusters rather than relying on traditional image-level embeddings. We evaluate our proposed method on a multi-organ medical image segmentation task with both in-house data and the MICCAI Challenge 2015 BTCV dataset. Compared with current state-of-the-art training strategies, our proposed pipeline yields substantial improvements of 5.53% and 6.09% in Dice score on the two medical image segmentation cohorts, respectively (p-value<0.01). The performance of the proposed method is further assessed on natural images via the PASCAL VOC 2012 dataset, where it achieves a substantial improvement of 2.75% in mIoU (p-value<0.01).
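
The idea of embedding different semantic objects into different clusters can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each object's embedding is obtained by averaging pixel features under that object's mask, and it borrows the generic supervised-contrastive loss form, in which embeddings with the same semantic label act as positives. The function names (`masked_average_pool`, `semantic_contrastive_loss`) and the temperature value are hypothetical choices made for illustration.

```python
import math


def masked_average_pool(features, mask):
    """Average the per-pixel feature vectors selected by a binary mask.

    features: list of pixels, each a list of floats (one value per channel).
    mask: list of 0/1 values, one per pixel (assumed object/attention mask).
    """
    selected = [f for f, m in zip(features, mask) if m]
    if not selected:
        raise ValueError("mask selects no pixels")
    channels = len(selected[0])
    return [sum(f[c] for f in selected) / len(selected) for c in range(channels)]


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def semantic_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised-contrastive-style loss over per-object embeddings.

    Embeddings sharing a semantic label are positives; all others are
    negatives. This is a generic loss form, assumed here for illustration.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other embedding.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values())
        )
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy example: two organs, two object embeddings each.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_clustered = semantic_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mixed = semantic_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy example the loss is lower when the labels agree with the geometric clusters (`loss_clustered`) than when they cut across them (`loss_mixed`), which is the behaviour a semantic-aware objective is meant to reward.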

