LG · AI · CL · Feb 3, 2021

MUFASA: Multimodal Fusion Architecture Search for Electronic Health Records

arXiv:2102.02340v2 · 70 citations
Originality: Incremental advance
AI Analysis

This work provides an incremental improvement for deep learning practitioners working with multimodal EHR data by automating the search for effective fusion and architecture strategies.

The paper addresses the challenge of multimodal data fusion in electronic health records (EHR) by proposing MUFASA, a neural architecture search method that simultaneously optimizes fusion strategies and modality-specific architectures. MUFASA improved top-5 recall from 0.88 to 0.91 for CCS diagnosis code prediction compared to Transformer and Evolved Transformer baselines.

One important challenge of applying deep learning to electronic health records (EHR) is the complexity of their multimodal structure. EHR usually contains a mixture of structured (codes) and unstructured (free-text) data with sparse and irregular longitudinal features -- all of which doctors utilize when making decisions. In the deep learning regime, determining how different modality representations should be fused together is a difficult problem, which is often addressed by handcrafted modeling and intuition. In this work, we extend state-of-the-art neural architecture search (NAS) methods and propose MUltimodal Fusion Architecture SeArch (MUFASA) to simultaneously search across multimodal fusion strategies and modality-specific architectures for the first time. We demonstrate empirically that our MUFASA method outperforms established unimodal NAS on public EHR data with comparable computation costs. In addition, MUFASA produces architectures that outperform Transformer and Evolved Transformer. Compared with these baselines on CCS diagnosis code prediction, our discovered models improve top-5 recall from 0.88 to 0.91 and demonstrate the ability to generalize to other EHR tasks. Studying our top architecture in depth, we provide empirical evidence that MUFASA's improvements are derived from its ability to both customize modeling for each data modality and find effective fusion strategies.
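The headline metric above, top-5 recall, can be made concrete with a short sketch. The abstract does not spell out how the metric is computed, so the code assumes one common definition: for each patient record, the fraction of true diagnosis codes that appear among the k highest-scored predictions, averaged over records. The function names, the code labels, and the scores are all hypothetical illustrations, not the paper's evaluation code.

```python
from typing import Dict, Iterable, Set, Tuple


def top_k_recall(scores: Dict[str, float], true_codes: Set[str], k: int = 5) -> float:
    """Fraction of the true codes found among the k highest-scored codes.

    Assumed definition; the paper's exact formula may differ.
    """
    if not true_codes:
        raise ValueError("true_codes must be non-empty")
    # Rank candidate codes by predicted score, highest first, and keep the top k.
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = sum(1 for code in top_k if code in true_codes)
    return hits / len(true_codes)


def mean_top_k_recall(
    examples: Iterable[Tuple[Dict[str, float], Set[str]]], k: int = 5
) -> float:
    """Average the per-record top-k recall over a collection of records."""
    values = [top_k_recall(scores, truth, k) for scores, truth in examples]
    if not values:
        raise ValueError("examples must be non-empty")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical scores over made-up code labels for a single record.
    scores = {"code_a": 0.9, "code_b": 0.8, "code_c": 0.4, "code_d": 0.1}
    truth = {"code_a", "code_d"}
    print(top_k_recall(scores, truth, k=2))
```

Under this definition, a record whose true codes all fall inside the top five predictions scores 1.0, and the reported figure would be the mean over the evaluation set.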

