IV | Apr 7, 2022 | Code
Surface Vision Transformers: Flexible Attention-Based Modelling of Biomedical Surfaces
Simon Dahan, Hao Xu, Logan Z. J. Williams et al.
Recent state-of-the-art performance of Vision Transformers (ViT) in computer vision tasks demonstrates that a general-purpose architecture, which implements long-range self-attention, could replace the local feature learning operations of convolutional neural networks. In this paper, we extend ViTs to surfaces by reformulating the task of surface learning as a sequence-to-sequence learning problem and proposing patching mechanisms for general surface meshes. Sequences of patches are then processed by a transformer encoder and used for classification or regression. We validate our method on a range of different biomedical surface domains and tasks: brain age prediction in the developing Human Connectome Project (dHCP), fluid intelligence prediction in the Human Connectome Project (HCP), and coronary artery calcium score classification using surfaces from the Scottish Computed Tomography of the Heart (SCOT-HEART) dataset. We also investigate the impact of pretraining and data augmentation on model performance. Results suggest that Surface Vision Transformers (SiT) demonstrate consistent improvement over geometric deep learning methods for brain age and fluid intelligence prediction and achieve comparable performance on calcium score classification to standard metrics used in clinical practice. Furthermore, analysis of transformer attention maps offers clear and individualised predictions of the features driving each task. Code is available on GitHub: https://github.com/metrics-lab/surface-vision-transformers
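The patching step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `surface_to_sequence`, the toy mesh, and the patch layout are all hypothetical, and the sketch assumes patch membership is already given as lists of vertex indices. It only shows how per-vertex features could be flattened into one token per patch before being passed to a transformer encoder.

```python
from typing import List, Sequence


def surface_to_sequence(
    vertex_features: Sequence[Sequence[float]],
    patches: Sequence[Sequence[int]],
) -> List[List[float]]:
    """Flatten per-vertex features into one token per patch.

    vertex_features[v] holds the channels of vertex v.
    patches[p] lists the vertex indices belonging to patch p (assumed given).
    Returns a sequence of tokens, each of length len(patch) * channels.
    """
    tokens = []
    for patch in patches:
        token: List[float] = []
        for v in patch:
            token.extend(vertex_features[v])  # concatenate channels vertex by vertex
        tokens.append(token)
    return tokens


# Hypothetical toy surface: 6 vertices with 2 feature channels each.
features = [[float(v), float(v) * 10.0] for v in range(6)]
# Hypothetical patches of 3 vertices each; neighbouring patches share a vertex.
patches = [[0, 1, 2], [2, 3, 4], [4, 5, 0]]

sequence = surface_to_sequence(features, patches)
print(len(sequence), len(sequence[0]))  # number of tokens, token dimension
```

In a full model, each token would typically be linearly projected and combined with a positional embedding before the encoder; those steps are omitted here.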
11.4 | LG | May 20
Machine learning prediction of obstructive coronary artery disease using opportunistic coronary calcium and epicardial fat assessments from CT calcium scoring scans
Juhwan Lee, Ammar Hoori, Tao Hu et al.
Non-contrast computed tomography calcium scoring (CTCS) is a cost-effective imaging modality widely used to detect coronary artery calcifications. This study aimed to develop an advanced machine learning framework that utilizes quantitative analyses of coronary calcium and epicardial fat from CTCS images to predict obstructive coronary artery disease (CAD). The study population consisted of 1,324 patients from the SCOT-HEART clinical trial who underwent both CTCS and coronary CT angiography. We extracted and analyzed a broad range of features, including 24 clinical variables, 189 calcium-omics, and 211 epicardial fat-omics features from the CTCS images. Feature selection was conducted using the CatBoost algorithm combined with SHapley Additive exPlanation (SHAP) values. Predictive modeling utilized the CatBoost gradient boosting method, focusing on the most informative features. From an initial set of 424 candidate features, 14 were identified as most predictive through the CatBoost-SHAP method. The top two predictive features originated from fat-omics, with the remaining 12 features derived from calcium-omics. The optimized model achieved robust predictive capabilities, demonstrating a sensitivity of 83.1 ± 4.6%, specificity of 93.8 ± 1.7%, accuracy of 85.3 ± 2.0%, and an F1 score of 73.9 ± 3.3%. Inclusion of calcium-omics and fat-omics data significantly improved predictive performance. Notably, the model also showed reliable predictive accuracy in patients with diverse coronary calcium scores, including cases with obstructive CAD despite a zero-calcium score. This innovative approach holds promise for improving clinical decision-making and potentially reducing dependence on contrast-enhanced or invasive diagnostic procedures, particularly within low- to intermediate-risk patient groups.
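For readers unfamiliar with the four metrics reported in this abstract, the sketch below shows how sensitivity, specificity, accuracy, and F1 score follow from a binary confusion matrix. The function name `binary_metrics` and the counts are hypothetical and chosen only for illustration; they are not taken from the study, and the sketch assumes every denominator is non-zero.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard binary classification metrics from confusion-matrix counts.

    Assumes each denominator is non-zero (at least one positive case,
    one negative case, and one positive prediction).
    """
    sensitivity = tp / (tp + fn)            # recall on diseased cases
    specificity = tn / (tn + fp)            # recall on non-diseased cases
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not from the paper).
metrics = binary_metrics(tp=40, fp=10, tn=140, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 depends on precision, it is sensitive to class prevalence in a way that sensitivity and specificity are not, which is why it can sit well below the other three figures.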
CV | Nov 11, 2019 | Code
Disentangle, align and fuse for multimodal and semi-supervised image segmentation
Agisilaos Chartsias, Giorgos Papanastasiou, Chengjia Wang et al.
Magnetic resonance (MR) protocols rely on several sequences to assess pathology and organ status properly. Despite advances in image analysis, we tend to treat each sequence, here termed modality, in isolation. Taking advantage of the common information shared between modalities (an organ's anatomy) is beneficial for multi-modality processing and learning. However, we must overcome inherent anatomical misregistrations and disparities in signal intensity across the modalities to obtain this benefit. We present a method that offers improved segmentation accuracy of the modality of interest (over a single input model), by learning to leverage information present in other modalities, even if few (semi-supervised) or no (unsupervised) annotations are available for this specific modality. Core to our method is learning a disentangled decomposition into anatomical and imaging factors. Shared anatomical factors from the different inputs are jointly processed and fused to extract more accurate segmentation masks. Image misregistrations are corrected with a Spatial Transformer Network, which non-linearly aligns the anatomical factors. The imaging factor captures signal intensity characteristics across different modality data and is used for image reconstruction, enabling semi-supervised learning. Temporal and slice pairing between inputs are learned dynamically. We demonstrate applications in Late Gadolinium Enhanced (LGE) and Blood Oxygenation Level Dependent (BOLD) cardiac segmentation, as well as in T2 abdominal segmentation. Code is available at https://github.com/vios-s/multimodal_segmentation.