CVApr 27, 2025

MERA: Multimodal and Multiscale Self-Explanatory Model with Considerably Reduced Annotation for Lung Nodule Diagnosis

Jiahao Lu, Chong Yin, Silvia Ingala, Kenny Erleben, Michael Bachmann Nielsen, Sune Darkner

arXiv:2504.19357v13.61 citationsh-index: 16Has Code

Originality Incremental advance

AI Analysis

This addresses the challenge of deploying diagnostic AI in medical domains where labeled data is scarce, though it appears incremental in combining existing techniques like self-supervised learning and Vision Transformers for a specific application.

The study tackled the problem of accurate lung nodule diagnosis with limited labeled data by introducing MERA, a multimodal and multiscale self-explanatory model that integrates unsupervised and weakly supervised learning strategies. With only 1% annotated samples, MERA achieved diagnostic accuracy comparable to or exceeding state-of-the-art methods requiring full annotation on the LIDC dataset.

Lung cancer, a leading cause of cancer-related deaths globally, emphasises the importance of early detection for better patient outcomes. Pulmonary nodules, often early indicators of lung cancer, necessitate accurate, timely diagnosis. Despite Explainable Artificial Intelligence (XAI) advances, many existing systems struggle providing clear, comprehensive explanations, especially with limited labelled data. This study introduces MERA, a Multimodal and Multiscale self-Explanatory model designed for lung nodule diagnosis with considerably Reduced Annotation requirements. MERA integrates unsupervised and weakly supervised learning strategies (self-supervised learning techniques and Vision Transformer architecture for unsupervised feature extraction) and a hierarchical prediction mechanism leveraging sparse annotations via semi-supervised active learning in the learned latent space. MERA explains its decisions on multiple levels: model-level global explanations via semantic latent space clustering, instance-level case-based explanations showing similar instances, local visual explanations via attention maps, and concept explanations using critical nodule attributes. Evaluations on the public LIDC dataset show MERA's superior diagnostic accuracy and self-explainability. With only 1% annotated samples, MERA achieves diagnostic accuracy comparable to or exceeding state-of-the-art methods requiring full annotation. The model's inherent design delivers comprehensive, robust, multilevel explanations aligned closely with clinical practice, enhancing trustworthiness and transparency. Demonstrated viability of unsupervised and weakly supervised learning lowers the barrier to deploying diagnostic AI in broader medical domains. Our complete code is open-source available: https://github.com/diku-dk/credanno.

View on arXiv PDF Code

Similar