IV · CV · Sep 30, 2023

MVC: A Multi-Task Vision Transformer Network for COVID-19 Diagnosis from Chest X-ray Images

arXiv:2310.00418v1 · 1 citation · h-index: 24
Originality: Incremental advance
AI Analysis

This work addresses the need for a unified multi-task framework for COVID-19 diagnosis in medical image analysis. The advance is incremental, since it builds on existing Vision Transformer methods.

The authors tackle COVID-19 diagnosis from chest X-ray images by proposing a multi-task Vision Transformer network that simultaneously classifies images and identifies affected regions. They report better performance than the baselines on both tasks.

Medical image analysis using computer-based algorithms has attracted considerable attention from the research community and achieved tremendous progress in the last decade. With recent advances in computing resources and availability of large-scale medical image datasets, many deep learning models have been developed for disease diagnosis from medical images. However, existing techniques focus on sub-tasks, e.g., disease classification and identification, individually, while there is a lack of a unified framework enabling multi-task diagnosis. Inspired by the capability of Vision Transformers in both local and global representation learning, we propose in this paper a new method, namely Multi-task Vision Transformer (MVC) for simultaneously classifying chest X-ray images and identifying affected regions from the input data. Our method is built upon the Vision Transformer but extends its learning capability in a multi-task setting. We evaluated our proposed method and compared it with existing baselines on a benchmark dataset of COVID-19 chest X-ray images. Experimental results verified the superiority of the proposed method over the baselines on both the image classification and affected region identification tasks.
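To make the multi-task setting concrete, here is a minimal sketch of how a classification objective and a region-identification objective can be combined into one training loss. It is not the authors' formulation: the abstract does not specify the heads or the loss, so the per-patch binary mask, the function names, and the `region_weight` parameter are all assumptions chosen for illustration.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one image, given raw class logits and an integer label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def binary_cross_entropy(logits, targets):
    """Mean binary cross-entropy over per-patch logits and 0/1 targets."""
    total = 0.0
    for z, t in zip(logits, targets):
        # softplus(z) - t*z is the numerically stable BCE-with-logits form
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += softplus - t * z
    return total / len(logits)


def multitask_loss(class_logits, class_label, patch_logits, patch_mask,
                   region_weight=1.0):
    """Hypothetical joint loss: image-level term plus weighted patch-level term.

    class_logits / class_label: image-level diagnosis (assumed head 1).
    patch_logits / patch_mask: one logit and one 0/1 target per image patch,
    marking affected regions (assumed head 2).
    region_weight: assumed scalar balancing the two tasks.
    """
    cls_term = softmax_cross_entropy(class_logits, class_label)
    region_term = binary_cross_entropy(patch_logits, patch_mask)
    return cls_term + region_weight * region_term, cls_term, region_term


# Example: 3 classes, 4 patches, with patches 1 and 2 marked as affected.
total, cls_term, region_term = multitask_loss(
    class_logits=[2.0, 0.5, -1.0], class_label=0,
    patch_logits=[-1.5, 2.0, 1.0, -0.5], patch_mask=[0, 1, 1, 0],
    region_weight=0.5,
)
print(total, cls_term, region_term)
```

In a shared-encoder design, both terms would backpropagate through the same transformer backbone, which is what lets one network serve both tasks. The relative weight is a tuning choice rather than something stated in the source.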

