Detecting Heart Disease from Multi-View Ultrasound Images via Supervised Attention Multiple Instance Learning
This work addresses the under-diagnosis of aortic stenosis, a degenerative heart valve condition, by improving the accuracy of automated screening. The contribution is incremental in that it builds on existing multiple instance learning methods.
The paper automates aortic stenosis detection from multi-view ultrasound images by developing a supervised attention multiple instance learning method with self-supervised pretraining, which achieves higher accuracy with a smaller model on an open-access dataset and an external validation set.
Aortic stenosis (AS) is a degenerative valve condition that causes substantial morbidity and mortality. This condition is under-diagnosed and under-treated. In clinical practice, AS is diagnosed with expert review of transthoracic echocardiography, which produces dozens of ultrasound images of the heart. Only some of these views show the aortic valve. To automate screening for AS, deep networks must learn to mimic a human expert's ability to identify views of the aortic valve and then aggregate across these relevant images to produce a study-level diagnosis. We find that previous approaches to AS detection yield insufficient accuracy because they rely on inflexible averages across images. We further find that off-the-shelf attention-based multiple instance learning (MIL) performs poorly. We contribute a new end-to-end MIL approach with two key methodological innovations. First, a supervised attention technique guides the learned attention mechanism to favor relevant views. Second, a novel self-supervised pretraining strategy applies contrastive learning on the representation of the whole study instead of individual images, as is commonly done in prior literature. Experiments on an open-access dataset and an external validation set show that our approach yields higher accuracy while reducing model size.
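To make the first idea concrete, the following is a minimal sketch of attention-based MIL pooling with an auxiliary term that nudges attention toward relevant views. It is an illustration under stated assumptions, not the paper's implementation: the tanh attention scorer, the function names (`attention_pool`, `supervised_attention_loss`), the toy dimensions, and the choice of a KL divergence between a normalized view-relevance target and the attention weights are all assumptions made here.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(feats, v, w):
    """Pool per-image features into one study-level vector.

    feats: (n_images, d) image embeddings for one study (the "bag").
    v: (d, h) and w: (h,) are hypothetical attention parameters.
    """
    scores = np.tanh(feats @ v) @ w   # one scalar score per image
    attn = softmax(scores)            # attention weights, sum to 1
    study_repr = attn @ feats         # weighted average of embeddings, shape (d,)
    return attn, study_repr

def supervised_attention_loss(attn, relevance, eps=1e-8):
    """Assumed form: KL(target || attn), where target is the normalized
    per-image view relevance (e.g. 1 for views showing the aortic valve)."""
    target = relevance / relevance.sum()
    return float(np.sum(target * (np.log(target + eps) - np.log(attn + eps))))

# Toy study: 6 images, 4-dim embeddings; the first two are relevant views.
rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4))
v = rng.normal(size=(4, 3))
w = rng.normal(size=3)
relevance = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0])

attn, study_repr = attention_pool(feats, v, w)
loss = supervised_attention_loss(attn, relevance)
```

In training, a term like `loss` would be added to the study-level diagnosis loss, so that attention mass moves toward images judged relevant. A plain average across images corresponds to fixing `attn` at `1 / n_images` for every image, which is the inflexibility the abstract criticizes.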