CV | AI | Dec 7, 2023

Joint-Individual Fusion Structure with Fusion Attention Module for Multi-Modal Skin Cancer Classification

arXiv:2312.04189v1 | 7 citations | h-index: 69
Originality: Incremental advance
AI Analysis

This work addresses the need for more accurate skin cancer diagnosis by integrating multi-modal data, offering an incremental improvement over existing fusion techniques in medical imaging.

The paper tackles skin cancer classification by combining dermatological images with patient metadata, proposing a joint-individual fusion structure and fusion attention module that outperforms state-of-the-art methods on three public datasets with improved accuracy across CNN backbones.

Most convolutional neural network (CNN) based methods for skin cancer classification obtain their results using only dermatological images. Although these methods show good classification results, more accurate results can be achieved by also considering the patient's metadata, which is valuable clinical information for dermatologists. Current multi-modal classification methods use only a simple joint fusion structure (FS) and simple fusion modules (FMs), so there is still room to increase accuracy by exploring more advanced FSs and FMs. Therefore, in this paper, we design a new fusion method that combines dermatological images (dermoscopy images or clinical images) and patient metadata for skin cancer classification from the perspectives of FS and FM. First, we propose a joint-individual fusion (JIF) structure that learns the shared features of multi-modality data while preserving modality-specific features. Second, we introduce a fusion attention (FA) module that enhances the most relevant image and metadata features using both self-attention and mutual-attention mechanisms to support the decision-making pipeline. We compare the proposed JIF-MMFA method with other state-of-the-art fusion methods on three different public datasets. The results show that our JIF-MMFA method improves the classification results for all tested CNN backbones and performs better than the other fusion methods on the three public datasets, demonstrating our method's effectiveness and robustness.
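The abstract gives no layer-level details, so the following is only a minimal NumPy sketch of the two ideas it names, under stated assumptions. It assumes the image features are a fixed-length vector already produced by some CNN backbone (random vectors stand in for them here), that "self" and "mutual" attention can be approximated by sigmoid gates computed from the same modality and from the other modality, and that the joint and individual branches are combined by averaging their logits. The class names `FusionAttention` and `JIFClassifier`, all dimensions, and the random weights are hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def linear(in_dim, out_dim):
    # Hypothetical, randomly initialised weight matrix (no training here).
    return rng.normal(scale=0.1, size=(in_dim, out_dim))

class FusionAttention:
    """Assumed gating form: each modality is re-weighted by a gate that
    mixes a self term (same modality) and a mutual term (other modality)."""
    def __init__(self, img_dim, meta_dim):
        self.w_ii = linear(img_dim, img_dim)    # image -> image gate (self)
        self.w_mi = linear(meta_dim, img_dim)   # metadata -> image gate (mutual)
        self.w_mm = linear(meta_dim, meta_dim)  # metadata -> metadata gate (self)
        self.w_im = linear(img_dim, meta_dim)   # image -> metadata gate (mutual)

    def __call__(self, img, meta):
        img_gate = sigmoid(img @ self.w_ii + meta @ self.w_mi)
        meta_gate = sigmoid(meta @ self.w_mm + img @ self.w_im)
        return img * img_gate, meta * meta_gate

class JIFClassifier:
    """Assumed joint-individual layout: two individual heads keep
    modality-specific features, one joint head sees the fused features."""
    def __init__(self, img_dim, meta_dim, n_classes):
        self.fa = FusionAttention(img_dim, meta_dim)
        self.head_img = linear(img_dim, n_classes)
        self.head_meta = linear(meta_dim, n_classes)
        self.head_joint = linear(img_dim + meta_dim, n_classes)

    def __call__(self, img, meta):
        logits_img = img @ self.head_img      # individual image branch
        logits_meta = meta @ self.head_meta   # individual metadata branch
        img_att, meta_att = self.fa(img, meta)
        fused = np.concatenate([img_att, meta_att], axis=1)
        logits_joint = fused @ self.head_joint  # joint branch
        # Assumption: branches are combined by averaging their logits.
        return softmax((logits_img + logits_meta + logits_joint) / 3.0)

# Usage: 2 patients, 16-d stand-in image features, 4-d encoded metadata.
model = JIFClassifier(img_dim=16, meta_dim=4, n_classes=3)
img = rng.normal(size=(2, 16))
meta = rng.normal(size=(2, 4))
probs = model(img, meta)
```

Keeping separate heads for each modality alongside the joint head is what lets a layout like this retain modality-specific evidence even when the fused representation is dominated by one input. How the actual method weights or trains the branches is not stated in the abstract.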
