IV CV | Sep 30, 2023

MVC: A Multi-Task Vision Transformer Network for COVID-19 Diagnosis from Chest X-ray Images

Huyen Tran, Duc Thanh Nguyen, John Yearwood

arXiv:2310.00418v13.01 citations | h-index: 24

Originality: Incremental advance

AI Analysis

This work addresses the need for a unified multi-task framework for COVID-19 diagnosis in medical image analysis. The advance is incremental because it builds on existing Vision Transformer methods.

The authors tackled the problem of COVID-19 diagnosis from chest X-ray images by proposing a multi-task vision transformer network that simultaneously classifies images and identifies affected regions, achieving superior performance over baselines on both tasks.

Medical image analysis using computer-based algorithms has attracted considerable attention from the research community and achieved tremendous progress in the last decade. With recent advances in computing resources and availability of large-scale medical image datasets, many deep learning models have been developed for disease diagnosis from medical images. However, existing techniques focus on sub-tasks, e.g., disease classification and identification, individually, while there is a lack of a unified framework enabling multi-task diagnosis. Inspired by the capability of Vision Transformers in both local and global representation learning, we propose in this paper a new method, namely Multi-task Vision Transformer (MVC) for simultaneously classifying chest X-ray images and identifying affected regions from the input data. Our method is built upon the Vision Transformer but extends its learning capability in a multi-task setting. We evaluated our proposed method and compared it with existing baselines on a benchmark dataset of COVID-19 chest X-ray images. Experimental results verified the superiority of the proposed method over the baselines on both the image classification and affected region identification tasks.
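The abstract does not say how the two tasks are trained together, so the following is only a minimal sketch of one common way to set up a multi-task objective, not the authors' formulation. It assumes a shared encoder that emits image-level class logits plus one logit per image patch, and it combines a cross-entropy term for classification with a per-patch binary cross-entropy term for affected-region identification. The function names, the patch-level mask, and the `region_weight` parameter are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classification_loss(class_logits, label):
    """Cross-entropy for the image-level label (assumed task 1)."""
    return -math.log(softmax(class_logits)[label])


def region_loss(patch_logits, patch_mask):
    """Mean binary cross-entropy over patches (assumed task 2).

    patch_mask[i] is 1 if patch i is marked as affected, else 0.
    Uses the stable form max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    total = 0.0
    for z, y in zip(patch_logits, patch_mask):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(patch_logits)


def multitask_loss(class_logits, label, patch_logits, patch_mask,
                   region_weight=1.0):
    """Weighted sum of the two task losses (hypothetical weighting)."""
    return (classification_loss(class_logits, label)
            + region_weight * region_loss(patch_logits, patch_mask))


# Toy example: 3 classes, 4 patches, the last two patches marked as affected.
loss = multitask_loss(
    class_logits=[2.0, 0.5, -1.0],
    label=0,
    patch_logits=[-2.0, -1.5, 1.0, 3.0],
    patch_mask=[0, 0, 1, 1],
    region_weight=0.5,
)
print(f"combined loss: {loss:.4f}")
```

In a sketch like this, both terms backpropagate through the same encoder, which is what would make the setup multi-task rather than two separate models. The relative weight of the two terms is a tuning choice that the abstract does not discuss.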
