CV · AI · Feb 24, 2024

General Purpose Image Encoder DINOv2 for Medical Image Registration

arXiv:2402.15687v1 · 18 citations · h-index: 20
Originality: Incremental advance
AI Analysis

This work addresses the need for robust medical image alignment without large training datasets, though it is an incremental application of existing foundation models.

The paper tackles the problem of medical image registration by introducing DINO-Reg, a training-free method that uses the general-purpose image encoder DINOv2 for feature extraction, achieving first place in the OncoReg Challenge.

Existing medical image registration algorithms rely on either dataset-specific training or local texture-based features to align images. The former cannot be reliably implemented without large modality-specific training datasets, while the latter lacks global semantics and can therefore easily become trapped in local minima. In this paper, we present a training-free deformable image registration method, DINO-Reg, which leverages the general-purpose image encoder DINOv2 for image feature extraction. The DINOv2 encoder was trained on ImageNet data containing natural images. We used the pretrained DINOv2 without any fine-tuning. Our method feeds the DINOv2-encoded features into a discrete optimizer to find the optimal deformable registration field. We conducted a series of experiments to understand the behavior and role of such a general-purpose image encoder in image registration. Combined with handcrafted features, our method won first place in the recent OncoReg Challenge. To our knowledge, this is the first application of general vision foundation models to medical image registration.
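The idea of feeding encoder features into a discrete optimizer can be illustrated with a toy sketch. The abstract does not describe DINO-Reg's actual optimizer, so everything below is an assumption made for illustration: the function name `discrete_displacements`, the squared-distance cost, the 2D grid, and the small search radius are all hypothetical, and the sketch omits the smoothness regularization that a practical deformable registration method would need. Each grid cell simply picks, from a discrete set of integer displacements, the one whose moving-image feature is most similar to its own.

```python
from itertools import product


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def discrete_displacements(fixed, moving, radius=1):
    """Pick, per grid cell, the best integer displacement (dy, dx).

    `fixed` and `moving` are H x W grids (nested lists) of feature vectors.
    They stand in for patch features from a pretrained encoder; this is a
    hypothetical simplification, not the optimizer used in the paper.
    """
    height, width = len(fixed), len(fixed[0])
    offsets = list(product(range(-radius, radius + 1), repeat=2))
    field = []
    for y in range(height):
        row = []
        for x in range(width):
            best, best_cost = (0, 0), float("inf")
            for dy, dx in offsets:
                yy, xx = y + dy, x + dx
                if not (0 <= yy < height and 0 <= xx < width):
                    continue  # candidate falls outside the moving grid
                cost = sq_dist(fixed[y][x], moving[yy][xx])
                if cost < best_cost:
                    best, best_cost = (dy, dx), cost
            row.append(best)
        field.append(row)
    return field


# Toy data: each cell's feature is its own coordinate, and `moving` is
# `fixed` shifted one cell to the right with a filler first column.
fixed = [[[float(y), float(x)] for x in range(4)] for y in range(3)]
moving = [
    [fixed[y][x - 1] if x > 0 else [-10.0, -10.0] for x in range(4)]
    for y in range(3)
]
field = discrete_displacements(fixed, moving)
print(field[1][1])  # (0, 1): the matching feature sits one cell to the right
```

Because the search is exhaustive over a fixed set of candidate displacements, it does not follow a local gradient. This toy version still treats every cell independently, so it says nothing about how the real method keeps the deformation field smooth.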
