CV | Aug 6, 2025

Small Lesions-aware Bidirectional Multimodal Multiscale Fusion Network for Lung Disease Classification

arXiv:2508.04205v1 | h-index: 12 | Has Code | MICCAI
Originality: Incremental advance
AI Analysis

This work addresses the problem of accurate lung disease diagnosis for medical practitioners, though it appears incremental as it builds on existing multimodal deep learning approaches.

The paper tackles the challenge of misdiagnosing small lesions in lung disease classification by proposing a multimodal multiscale fusion network, which significantly improves diagnostic accuracy on the Lung-PET-CT-Dx dataset, surpassing state-of-the-art methods.

The diagnosis of medical diseases faces challenges such as the misdiagnosis of small lesions. Deep learning, particularly multimodal approaches, has shown great potential in the field of medical disease diagnosis. However, the differences in dimensionality between medical imaging and electronic health record data present challenges for effective alignment and fusion. To address these issues, we propose the Multimodal Multiscale Cross-Attention Fusion Network (MMCAF-Net). This model employs a feature pyramid structure combined with an efficient 3D multi-scale convolutional attention module to extract lesion-specific features from 3D medical images. To further enhance multimodal data integration, MMCAF-Net incorporates a multi-scale cross-attention module, which resolves dimensional inconsistencies, enabling more effective feature fusion. We evaluated MMCAF-Net on the Lung-PET-CT-Dx dataset, and the results showed a significant improvement in diagnostic accuracy, surpassing current state-of-the-art methods. The code is available at https://github.com/yjx1234/MMCAF-Net
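The abstract does not spell out how the multi-scale cross-attention module reconciles a flat EHR vector with 3D image features of different sizes. As a rough illustration only, the sketch below shows one generic way cross-attention can do this: the EHR vector is projected to a query, the image tokens at each pyramid scale are projected to keys and values of the same width, and the attended summaries are concatenated. This is not the authors' implementation. The function names, the random projection matrices, `d_model`, and the toy shapes are all assumptions made for the example.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Stand-in for a learned projection (hypothetical, untrained weights)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def cross_attend(ehr, tokens, Wq, Wk, Wv):
    """Single-head cross-attention: the EHR vector queries the image tokens.

    ehr    : list of tabular features
    tokens : list of image feature vectors (flattened voxels at one scale)
    Returns the attended feature vector and the attention weights.
    """
    q = matvec(Wq, ehr)
    keys = [matvec(Wk, t) for t in tokens]
    values = [matvec(Wv, t) for t in tokens]
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    weights = softmax(scores)
    fused = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(q))]
    return fused, weights


def fuse_multiscale(ehr, pyramid, d_model=8, seed=0):
    """Attend to every pyramid level and concatenate the results.

    pyramid : list of scales, each a list of tokens; the channel width may
              differ per scale, which is why each scale gets its own
              key/value projection into the shared d_model space.
    """
    rng = random.Random(seed)
    Wq = random_matrix(d_model, len(ehr), rng)
    fused_all = []
    for tokens in pyramid:
        channels = len(tokens[0])
        Wk = random_matrix(d_model, channels, rng)
        Wv = random_matrix(d_model, channels, rng)
        fused, _ = cross_attend(ehr, tokens, Wq, Wk, Wv)
        fused_all.extend(fused)
    return fused_all


# Toy input: 5 EHR features, two scales with different token counts and widths.
ehr = [0.5, 1.0, 0.0, 0.3, 0.7]
pyramid = [
    [[0.1] * 16 for _ in range(4)],  # finer scale: 4 tokens, 16 channels
    [[0.2] * 32 for _ in range(2)],  # coarser scale: 2 tokens, 32 channels
]
fused = fuse_multiscale(ehr, pyramid)
print(len(fused))  # one d_model-sized summary per scale, concatenated
```

In this sketch the dimensional mismatch is handled purely by the per-scale projections: whatever the channel width or token count at a given scale, the output for that scale has width `d_model`, so the concatenated vector has a fixed size that a classifier head could consume.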

Code Implementations: 1 repo
