IV · CV · LG · Mar 10, 2022

Self Pre-training with Masked Autoencoders for Medical Image Classification and Segmentation

arXiv:2203.05573v2 · 143 citations · h-index: 32 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the shortage of pre-training data for medical image analysis and offers a practical option for scenarios where such data is difficult to acquire. The contribution is incremental: it adapts an existing method to a new domain.

The paper tackles the lack of large-scale medical image datasets for pre-training by proposing self pre-training with Masked Autoencoders (MAE) on the target dataset itself, which improved performance on chest X-ray classification, CT segmentation, and MRI segmentation.

Masked Autoencoder (MAE) has recently been shown to be effective in pre-training Vision Transformers (ViT) for natural image analysis. By reconstructing full images from partially masked inputs, a ViT encoder aggregates contextual information to infer masked image regions. We believe that this context aggregation ability is particularly essential to the medical image domain where each anatomical structure is functionally and mechanically connected to other structures and regions. Because there is no ImageNet-scale medical image dataset for pre-training, we investigate a self pre-training paradigm with MAE for medical image analysis tasks. Our method pre-trains a ViT on the training set of the target data instead of another dataset. Thus, self pre-training can benefit more scenarios where pre-training data is hard to acquire. Our experimental results show that MAE self pre-training markedly improves diverse medical image tasks including chest X-ray disease classification, abdominal CT multi-organ segmentation, and MRI brain tumor segmentation. Code is available at https://github.com/cvlab-stonybrook/SelfMedMAE
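The masking-and-reconstruction idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that); the function names `patchify`, `random_masking`, and `masked_mse` are hypothetical, and the patch size and the 0.75 mask ratio used in the usage example are assumed values that the abstract does not state. The sketch only shows how an image is split into patches, how a random subset is hidden from the encoder, and how a reconstruction loss can be restricted to the hidden patches.

```python
import numpy as np


def patchify(image, patch):
    """Split a 2D (H, W) image into non-overlapping, flattened patches."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    # (gh, patch, gw, patch) -> (gh, gw, patch, patch) -> (gh*gw, patch*patch)
    x = image.reshape(gh, patch, gw, patch).transpose(0, 2, 1, 3)
    return x.reshape(gh * gw, patch * patch)


def random_masking(patches, mask_ratio, rng):
    """Keep a random subset of patches; the rest are hidden from the encoder.

    Returns the visible patches, a boolean mask (True = masked), and the
    indices of the visible patches.
    """
    n = patches.shape[0]
    n_keep = int(n * (1 - mask_ratio))
    keep_idx = np.sort(rng.permutation(n)[:n_keep])
    mask = np.ones(n, dtype=bool)
    mask[keep_idx] = False
    return patches[keep_idx], mask, keep_idx


def masked_mse(pred, target, mask):
    """Mean squared error averaged over masked patches only."""
    per_patch = ((pred - target) ** 2).mean(axis=1)
    return per_patch[mask].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((224, 224))          # stand-in for one training image
    patches = patchify(image, patch=16)     # assumed patch size
    visible, mask, keep_idx = random_masking(patches, mask_ratio=0.75, rng=rng)
    # A real model would encode `visible` with a ViT and decode all patches;
    # here a constant prediction stands in for the decoder output.
    pred = np.full_like(patches, patches.mean())
    print(visible.shape, int(mask.sum()), masked_mse(pred, patches, mask))
```

In self pre-training as the abstract describes it, the images fed through such a pipeline would come from the training set of the target task rather than from a separate pre-training dataset.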

Code Implementations: 1 repo
