Co-Attentive Cross-Modal Deep Learning for Medical Evidence Synthesis and Decision Making
This work addresses the need for efficient and accurate medical evidence synthesis in disease diagnosis, although the contribution is incremental because it builds on existing cross-modal approaches.
The paper tackles the problem of synthesizing multimodal medical data by proposing a cross-modal deep learning architecture with co-attention, which improves Parkinson's Disease diagnosis accuracy by 2.35% over the previous state-of-the-art unimodal analysis and is 53% more parameter efficient than the industry-standard cross-modal model.
Modern medicine requires generalised approaches to the synthesis and integration of multimodal data, often at different biological scales, that can be applied to a variety of evidence structures, such as complex disease analyses and epidemiological models. However, current methods are either slow and expensive, or ineffective because they cannot model the complex relationships between data modes that differ in scale and format. We address these issues by proposing a cross-modal deep learning architecture and co-attention mechanism that accurately model the relationships between the different data modes, while also reducing patient diagnosis time. Differentiating Parkinson's Disease (PD) patients from healthy individuals forms the basis of the evaluation. The model outperforms the previous state-of-the-art unimodal analysis by 2.35%, while also being 53% more parameter efficient than the industry-standard cross-modal model. Furthermore, evaluating the attention coefficients yields qualitative insights. Through coupling with bioinformatics, a novel link between the interferon-gamma-mediated pathway, DNA methylation and PD was identified. We believe that our approach is general and could optimise the process of medical evidence synthesis and decision making in an actionable way.
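To make the idea of co-attention between two data modes concrete, the following is a minimal NumPy sketch of a generic affinity-based co-attention step. It is not the architecture evaluated in this work: the function name `co_attention`, the bilinear `weight` matrix, the max-pooling over the affinity matrix, the feature dimensions and the random inputs are all assumptions chosen for illustration only.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def co_attention(feats_a, feats_b, weight):
    """Generic co-attention between two modalities (illustrative assumption).

    feats_a: (n_a, d_a) feature vectors from modality A
    feats_b: (n_b, d_b) feature vectors from modality B
    weight:  (d_a, d_b) bilinear matrix (learned in practice, random here)
    """
    # Affinity between every A item and every B item, shape (n_a, n_b).
    affinity = np.tanh(feats_a @ weight @ feats_b.T)
    # Each item is scored by its strongest affinity to the other modality.
    attn_a = softmax(affinity.max(axis=1))  # (n_a,)
    attn_b = softmax(affinity.max(axis=0))  # (n_b,)
    # Attention-weighted summaries of each modality.
    summary_a = attn_a @ feats_a  # (d_a,)
    summary_b = attn_b @ feats_b  # (d_b,)
    return attn_a, attn_b, summary_a, summary_b


# Hypothetical inputs: 5 items from one data mode, 3 from another.
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(5, 8))
feats_b = rng.normal(size=(3, 6))
weight = rng.normal(size=(8, 6)) * 0.1

attn_a, attn_b, summary_a, summary_b = co_attention(feats_a, feats_b, weight)
# A fused representation that a downstream classifier could consume.
fused = np.concatenate([summary_a, summary_b])
```

In a sketch like this, `attn_a` and `attn_b` play the role of attention coefficients: because each is a probability distribution over the items of one data mode, they can be inspected after training to see which inputs the model weighted most heavily.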