NCCVLGIVJun 16, 2020

Interpretable multimodal fusion networks reveal mechanisms of brain cognition

arXiv:2006.09454v146 citations
Originality Incremental advance
AI Analysis

This work addresses the need for interpretable models in biomedical research to study disease mechanisms, though it is incremental as it builds on existing deep-network-based fusion methods.

The authors tackled the challenge of interpreting multimodal fusion models for brain cognition by developing gCAM-CCL, which simultaneously performs automated diagnosis and generates interpretable activation maps, achieving good performance in classification and revealing mechanisms such as ROI activation patterns and genetic influences on cognition.

Multimodal fusion benefits disease diagnosis by providing a more comprehensive perspective. Developing algorithms is challenging due to data heterogeneity and the complex within- and between-modality associations. Deep-network-based data-fusion models have been developed to capture the complex associations and the performance in diagnosis has been improved accordingly. Moving beyond diagnosis prediction, evaluation of disease mechanisms is critically important for biomedical research. Deep-network-based data-fusion models, however, are difficult to interpret, bringing about difficulties for studying biological mechanisms. In this work, we develop an interpretable multimodal fusion model, namely gCAM-CCL, which can perform automated diagnosis and result interpretation simultaneously. The gCAM-CCL model can generate interpretable activation maps, which quantify pixel-level contributions of the input features. This is achieved by combining intermediate feature maps using gradient-based weights. Moreover, the estimated activation maps are class-specific, and the captured cross-data associations are interest/label related, which further facilitates class-specific analysis and biological mechanism analysis. We validate the gCAM-CCL model on a brain imaging-genetic study, and show gCAM-CCL's performed well for both classification and mechanism analysis. Mechanism analysis suggests that during task-fMRI scans, several object recognition related regions of interests (ROIs) are first activated and then several downstream encoding ROIs get involved. Results also suggest that the higher cognition performing group may have stronger neurotransmission signaling while the lower cognition performing group may have problem in brain/neuron development, resulting from genetic variations.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes