CV · Jun 2, 2023

A Novel Vision Transformer with Residual in Self-attention for Biomedical Image Classification

arXiv:2306.01594v2 · 11 citations · h-index: 29
Originality: Incremental advance
AI Analysis

This work addresses biomedical image classification challenges like limited samples and imbalanced data, but it is incremental as it modifies an existing ViT method.

The authors tackled biomedical image classification by proposing a vision transformer with residual connections in self-attention, achieving significant improvements over traditional ViT and convolutional models on blood cell and brain tumor datasets.

Biomedical image classification requires capturing bio-informatics based on specific feature distributions. Most such applications are challenging because samples of diseased cases are scarce and the datasets are imbalanced. This article presents a novel multi-head self-attention framework for the vision transformer (ViT) that is capable of capturing the specific image features needed for classification and analysis. The proposed method uses a residual connection to accumulate the best attention output in each multi-head attention block. The framework has been evaluated on two small datasets: (i) a blood cell classification dataset and (ii) a brain tumor detection dataset of brain MRI images. The results show a significant improvement over the traditional ViT and other convolution-based state-of-the-art classification models.
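
The abstract does not give the exact formulation, so the following is only a minimal NumPy sketch of one plausible reading: each block computes ordinary multi-head self-attention, and the attention outputs are accumulated across blocks through an extra residual path. The function names, the weight layout, and the accumulation rule (`carried = carried + attn`) are assumptions made for illustration, not the authors' implementation; layer normalization and the MLP sub-layer of a full ViT block are omitted.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Standard scaled dot-product multi-head self-attention.

    x has shape (tokens, dim); all weight matrices have shape (dim, dim).
    """
    n, d = x.shape
    d_head = d // num_heads

    def split(t):
        # (tokens, dim) -> (heads, tokens, d_head)
        return t.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, tokens, tokens)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (heads, tokens, d_head)
    merged = heads.transpose(1, 0, 2).reshape(n, d)       # concatenate heads
    return merged @ w_o


def residual_attention_stack(x, layers, num_heads):
    """Stack of attention blocks with an assumed residual path over attention outputs."""
    carried = np.zeros_like(x)  # running sum of attention outputs (assumption)
    for w_q, w_k, w_v, w_o in layers:
        attn = multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads)
        carried = carried + attn  # assumed accumulation across blocks
        x = x + carried           # usual skip connection around the attention sub-layer
    return x


rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 8))  # 5 patch tokens, embedding dim 8 (toy sizes)
layers = [tuple(rng.normal(scale=0.1, size=(8, 8)) for _ in range(4)) for _ in range(3)]
out = residual_attention_stack(tokens, layers, num_heads=2)
print(out.shape)
```

With a single block this reduces to the usual `x + attention(x)`; the extra path only changes the result from the second block onward, where earlier attention outputs are added again.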
