CV AI | Nov 18, 2022

Vision Transformers in Medical Imaging: A Review

Emerald U. Henry, Onyeka Emebob, Conrad Asotie Omonhinmin

arXiv:2211.10043v1 | 69 citations | h-index: 16

Originality: Synthesis-oriented

AI Analysis

This review helps medical imaging researchers understand whether the success of transformer architectures in computer vision carries over to medical applications.

This paper provides a comprehensive review of how Vision Transformers are being applied in medical imaging, comparing their performance to convolutional neural networks across tasks like classification, segmentation, registration, and reconstruction on standard medical datasets.

The Transformer, a model built on an attention-based encoder-decoder architecture, has gained prevalence in natural language processing (NLP) and has recently influenced the computer vision (CV) space. The similarities between computer vision and medical imaging have raised a question among researchers: can the impact of transformers on computer vision be translated to medical imaging? In this paper, we attempt to provide a comprehensive and recent review of the application of transformers in medical imaging by: describing the transformer model and comparing it with a variety of convolutional neural networks (CNNs); detailing the transformer-based approaches for medical image classification, segmentation, registration, and reconstruction, with a focus on the image modality; and comparing the performance of state-of-the-art transformer architectures to the best-performing CNNs on standard medical datasets.
