Deep Modeling and Optimization of Medical Image Classification
This work addresses data privacy and generalization challenges in medical AI, but it appears incremental as it adapts existing methods like CLIP and federated learning to the medical domain.
The paper tackles medical image classification with limited data due to privacy concerns by introducing a novel CLIP variant with multiple CNNs and ViTs as image encoders, combining deep models with federated learning, and using traditional ML methods to improve generalization. Results show that MaxViT achieves an 87.03% average across test metrics on HAM10000 with multimodal learning, ConvNeXt_L reaches an 83.98% F1-score under federated learning, and an SVM improves the average test metrics by ~2% for Swin transformers on ISIC2018.
Deep models, such as convolutional neural networks (CNNs) and vision transformers (ViTs), demonstrate remarkable performance in image classification. However, these models require large amounts of data to fine-tune, which is impractical in the medical domain because of data privacy constraints. Furthermore, despite the promising performance of contrastive language-image pre-training (CLIP) in the natural image domain, its potential has not been fully investigated in the medical field. To address these challenges, we consider three scenarios: 1) we introduce a novel CLIP variant that uses four CNNs and eight ViTs as image encoders for the classification of brain cancer and skin cancer, 2) we combine 12 deep models with two federated learning (FL) techniques to protect data privacy, and 3) we apply traditional machine learning (ML) methods to improve the generalization ability of these deep models on unseen domain data. The experimental results indicate that maxvit achieves the highest averaged (AVG) test metrics (AVG = 87.03\%) on the HAM10000 dataset with multimodal learning, while convnext\_l achieves a test F1-score of 83.98\%, compared to 81.33\% for swin\_b, in the FL setting. Furthermore, a support vector machine (SVM) improves the overall test metrics by $\sim 2\%$ in AVG for the swin transformer series on ISIC2018. Our code is available at https://github.com/AIPMLab/SkinCancerSimulation.
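To make the federated learning scenario concrete, the following is a minimal sketch of one server-side aggregation round. The abstract does not name the two FL techniques used, so this assumes a FedAvg-style sample-size-weighted average purely for illustration. The function name `fedavg`, the parameter names, and the two toy "hospital" clients are hypothetical and are not taken from the paper or its repository.

```python
from typing import Dict, List

# A model is represented here as a mapping from parameter name to a flat
# list of floats. This is a simplification for illustration only.
Weights = Dict[str, List[float]]


def fedavg(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Return the sample-size-weighted average of client parameters.

    Assumption: FedAvg-style aggregation; the paper's actual FL
    techniques are not specified in the abstract.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        # Each client contributes in proportion to its local dataset size.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical clients: only parameters are shared, never the raw images.
hospital_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
hospital_b = {"layer.weight": [3.0, 6.0], "layer.bias": [4.0]}

global_weights = fedavg([hospital_a, hospital_b], [100, 300])
print(global_weights)
```

In this sketch the client with 300 samples pulls the global parameters three times as strongly as the client with 100 samples, which is the usual motivation for weighting by local dataset size when sites hold unequal amounts of data.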