Hypernetwork-Based Adaptive Aggregation for Multimodal Multiple-Instance Learning in Predicting Coronary Calcium Debulking
This work addresses a specific medical imaging problem in cardiology; its novelty is incremental and lies in adapting the aggregation strategy to each patient's tabular data.
The paper predicts the necessity of coronary calcium debulking from CT images by formulating the task as multimodal multiple-instance learning. The proposed HyperAdAgFormer method proved effective in experiments on a clinical dataset.
In this paper, we present the first attempt to estimate the necessity of debulking coronary artery calcifications from computed tomography (CT) images. We formulate this task as a multiple-instance learning (MIL) problem. The difficulty of the task is that physicians adjust their focus and their decision criteria for device usage according to tabular data describing each patient's condition. To address this issue, we propose a hypernetwork-based adaptive aggregation transformer (HyperAdAgFormer), in which a hypernetwork adapts the feature aggregation strategy to each patient based on that patient's tabular data. Experiments on a clinical dataset demonstrate the effectiveness of HyperAdAgFormer. The code is publicly available at https://github.com/Shiku-Kaito/HyperAdAgFormer.
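The idea of letting tabular data steer the aggregation can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: it replaces the aggregation transformer with plain attention pooling, and the hypernetwork is a single linear map. The class name `HyperAttentionPool`, its parameters, and the toy inputs are all hypothetical and serve only to show how patient-specific tabular features can change which instances in a bag receive the most weight.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class HyperAttentionPool:
    """Attention-based MIL pooling whose query vector is generated from
    tabular data by a linear hypernetwork (an illustrative assumption,
    not the architecture described in the paper)."""

    def __init__(self, feat_dim, tab_dim, seed=0):
        rng = random.Random(seed)
        # Hypernetwork parameters: map a tabular vector to a query vector.
        self.weight = [[rng.gauss(0.0, 0.1) for _ in range(tab_dim)]
                       for _ in range(feat_dim)]
        self.bias = [0.0] * feat_dim

    def query(self, tabular):
        """Generate the patient-specific attention query."""
        return [sum(w * t for w, t in zip(row, tabular)) + b
                for row, b in zip(self.weight, self.bias)]

    def __call__(self, instances, tabular):
        """Aggregate instance features into one bag-level feature."""
        q = self.query(tabular)
        scale = math.sqrt(len(q))
        # Scaled dot-product score of each instance against the query.
        scores = [sum(qi * hi for qi, hi in zip(q, h)) / scale
                  for h in instances]
        attn = softmax(scores)
        # Bag feature: attention-weighted sum of the instance features.
        bag = [sum(a * h[j] for a, h in zip(attn, instances))
               for j in range(len(q))]
        return bag, attn


# Toy bag of three hypothetical instance feature vectors (feature size 4).
rng = random.Random(1)
instances = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]

pool = HyperAttentionPool(feat_dim=4, tab_dim=2)
# The same bag aggregated under two different hypothetical patient profiles.
bag_a, attn_a = pool(instances, tabular=[1.0, 0.0])
bag_b, attn_b = pool(instances, tabular=[0.0, 1.0])
```

Because the query depends on the tabular input, the same bag of image features yields different attention weights, and hence a different bag-level feature, for different patients. In a trained model the hypernetwork parameters would be learned jointly with the image encoder and the classifier; here they are random and untrained.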