Surajit Ray

2papers

2 Papers

CVSep 9, 2025Code
Data-Efficient Fine-Tuning of Vision-Language Models for Diagnosis of Alzheimer's Disease

Fangqi Cheng, Surajit Ray, Xiaochen Yang

Medical vision-language models (Med-VLMs) have shown impressive results in tasks such as report generation and visual question answering, but they still face several limitations. Most notably, they underutilize patient metadata and lack integration of clinical diagnostic knowledge. Moreover, most existing models are typically trained from scratch or fine-tuned on large-scale 2D image-text pairs, requiring extensive computational resources, and their effectiveness on 3D medical imaging is often limited due to the absence of structural information. To address these gaps, we propose a data-efficient fine-tuning pipeline to adapt 3D CT-based Med-VLMs for 3D MRI and demonstrate its application in Alzheimer's disease (AD) diagnosis. Our system introduces two key innovations. First, we convert structured metadata into synthetic reports, enriching textual input for improved image-text alignment. Second, we add an auxiliary token trained to predict the mini-mental state examination (MMSE) score, a widely used clinical measure of cognitive function that correlates with AD severity. This provides additional supervision for fine-tuning. Applying lightweight prompt tuning to both image and text modalities, our approach achieves state-of-the-art performance on ADNI with only 1,504 training MRIs, outperforming methods trained on 27,161 MRIs, and shows strong zero-shot generalization on OASIS-2 and AIBL. Code is available at https://github.com/CFQ666312/DEFT-VLM-AD.

IVApr 8, 2021
DenResCov-19: A deep transfer learning network for robust automatic classification of COVID-19, pneumonia, and tuberculosis from X-rays

Michail Mamalakis, Andrew J. Swift, Bart Vorselaars et al.

The global pandemic of COVID-19 is continuing to have a significant effect on the well-being of global population, increasing the demand for rapid testing, diagnosis, and treatment. Along with COVID-19, other etiologies of pneumonia and tuberculosis constitute additional challenges to the medical system. In this regard, the objective of this work is to develop a new deep transfer learning pipeline to diagnose patients with COVID-19, pneumonia, and tuberculosis, based on chest x-ray images. We observed in some instances DenseNet and Resnet have orthogonal performances. In our proposed model, we have created an extra layer with convolutional neural network blocks to combine these two models to establish superior performance over either model. The same strategy can be useful in other applications where two competing networks with complementary performance are observed. We have tested the performance of our proposed network on two-class (pneumonia vs healthy), three-class (including COVID-19), and four-class (including tuberculosis) classification problems. The proposed network has been able to successfully classify these lung diseases in all four datasets and has provided significant improvement over the benchmark networks of DenseNet, ResNet, and Inception-V3. These novel findings can deliver a state-of-the-art pre-screening fast-track decision network to detect COVID-19 and other lung pathologies.