CVMar 24, 2022Code
Transformer Compressed Sensing via Global Image TokensMarlon Bran Lorenzana, Craig Engstrom, Shekhar S. Chandra
Convolutional neural networks (CNN) have demonstrated outstanding Compressed Sensing (CS) performance compared to traditional, hand-crafted methods. However, they are broadly limited in terms of generalisability, inductive bias and difficulty to model long distance relationships. Transformer neural networks (TNN) overcome such issues by implementing an attention mechanism designed to capture dependencies between inputs. However, high-resolution tasks typically require vision Transformers (ViT) to decompose an image into patch-based tokens, limiting inputs to inherently local contexts. We propose a novel image decomposition that naturally embeds images into low-resolution inputs. These Kaleidoscope tokens (KD) provide a mechanism for global attention, at the same computational cost as a patch-based approach. To showcase this development, we replace CNN components in a well-known CS-MRI neural network with TNN blocks and demonstrate the improvements afforded by KD. We also propose an ensemble of image tokens, which enhance overall image quality and reduces model size. Supplementary material is available: https://github.com/uqmarlonbran/TCS.git
CVMar 24, 2023
Evidence-aware multi-modal data fusion and its application to total knee replacement predictionXinwen Liu, Jing Wang, S. Kevin Zhou et al.
Deep neural networks have been widely studied for predicting a medical condition, such as total knee replacement (TKR). It has shown that data of different modalities, such as imaging data, clinical variables and demographic information, provide complementary information and thus can improve the prediction accuracy together. However, the data sources of various modalities may not always be of high quality, and each modality may have only partial information of medical condition. Thus, predictions from different modalities can be opposite, and the final prediction may fail in the presence of such a conflict. Therefore, it is important to consider the reliability of each source data and the prediction output when making a final decision. In this paper, we propose an evidence-aware multi-modal data fusion framework based on the Dempster-Shafer theory (DST). The backbone models contain an image branch, a non-image branch and a fusion branch. For each branch, there is an evidence network that takes the extracted features as input and outputs an evidence score, which is designed to represent the reliability of the output from the current branch. The output probabilities along with the evidence scores from multiple branches are combined with the Dempster's combination rule to make a final prediction. Experimental results on the public OA initiative (OAI) dataset for the TKR prediction task show the superiority of the proposed fusion strategy on various backbone models.
IVSep 22, 2022
Structure Guided Manifolds for Discovery of Disease CharacteristicsSiyu Liu, Linfeng Liu, Xuan Vinh et al.
In medical image analysis, the subtle visual characteristics of many diseases are challenging to discern, particularly due to the lack of paired data. For example, in mild Alzheimer's Disease (AD), brain tissue atrophy can be difficult to observe from pure imaging data, especially without paired AD and Cognitively Normal ( CN ) data for comparison. This work presents Disease Discovery GAN ( DiDiGAN), a weakly-supervised style-based framework for discovering and visualising subtle disease features. DiDiGAN learns a disease manifold of AD and CN visual characteristics, and the style codes sampled from this manifold are imposed onto an anatomical structural "blueprint" to synthesise paired AD and CN magnetic resonance images (MRIs). To suppress non-disease-related variations between the generated AD and CN pairs, DiDiGAN leverages a structural constraint with cycle consistency and anti-aliasing to enforce anatomical correspondence. When tested on the Alzheimer's Disease Neuroimaging Initiative ( ADNI) dataset, DiDiGAN showed key AD characteristics (reduced hippocampal volume, ventricular enlargement, and atrophy of cortical structures) through synthesising paired AD and CN scans. The qualitative results were backed up by automated brain volume analysis, where systematic pair-wise reductions in brain tissue structures were also measured
IVOct 30, 2025
SAMRI: Segment Anything Model for MRIZhao Wang, Wei Dai, Thuy Thanh Dao et al.
Accurate magnetic resonance imaging (MRI) segmentation is crucial for clinical decision-making, but remains labor-intensive when performed manually. Convolutional neural network (CNN)-based methods can be accurate and efficient, but often generalize poorly to MRI's variable contrast, intensity inhomogeneity, and protocols. Although the transformer-based Segment Anything Model (SAM) has demonstrated remarkable generalizability in natural images, existing adaptations often treat MRI as another imaging modality, overlooking these modality-specific challenges. We present SAMRI, an MRI-specialized SAM trained and validated on 1.1 million labeled MR slices spanning whole-body organs and pathologies. We demonstrate that SAM can be effectively adapted to MRI by simply fine-tuning its mask decoder using a two-stage strategy, reducing training time by 94% and trainable parameters by 96% versus full-model retraining. Across diverse MRI segmentation tasks, SAMRI achieves a mean Dice of 0.87, delivering state-of-the-art accuracy across anatomical regions and robust generalization on unseen structures, particularly small and clinically important structures.
IVNov 30, 2022
Automated anomaly-aware 3D segmentation of bones and cartilages in knee MR images from the Osteoarthritis InitiativeBoyeong Woo, Craig Engstrom, William Baresic et al.
In medical image analysis, automated segmentation of multi-component anatomical structures, which often have a spectrum of potential anomalies and pathologies, is a challenging task. In this work, we develop a multi-step approach using U-Net-based neural networks to initially detect anomalies (bone marrow lesions, bone cysts) in the distal femur, proximal tibia and patella from 3D magnetic resonance (MR) images of the knee in individuals with varying grades of osteoarthritis. Subsequently, the extracted data are used for downstream tasks involving semantic segmentation of individual bone and cartilage volumes as well as bone anomalies. For anomaly detection, the U-Net-based models were developed to reconstruct the bone profiles of the femur and tibia in images via inpainting so anomalous bone regions could be replaced with close to normal appearances. The reconstruction error was used to detect bone anomalies. A second anomaly-aware network, which was compared to anomaly-naïve segmentation networks, was used to provide a final automated segmentation of the femoral, tibial and patellar bones and cartilages from the knee MR images containing a spectrum of bone anomalies. The anomaly-aware segmentation approach provided up to 58% reduction in Hausdorff distances for bone segmentations compared to the results from the anomaly-naïve segmentation networks. In addition, the anomaly-aware networks were able to detect bone lesions in the MR images with greater sensitivity and specificity (area under the receiver operating characteristic curve [AUC] up to 0.896) compared to the anomaly-naïve segmentation networks (AUC up to 0.874).
CVNov 27, 2020
Manipulating Medical Image Translation with Manifold DisentanglementSiyu Liu, Jason A. Dowling, Craig Engstrom et al.
Medical image translation (e.g. CT to MR) is a challenging task as it requires I) faithful translation of domain-invariant features (e.g. shape information of anatomical structures) and II) realistic synthesis of target-domain features (e.g. tissue appearance in MR). In this work, we propose Manifold Disentanglement Generative Adversarial Network (MDGAN), a novel image translation framework that explicitly models these two types of features. It employs a fully convolutional generator to model domain-invariant features, and it uses style codes to separately model target-domain features as a manifold. This design aims to explicitly disentangle domain-invariant features and domain-specific features while gaining individual control of both. The image translation process is formulated as a stylisation task, where the input is "stylised" (translated) into diverse target-domain images based on style codes sampled from the learnt manifold. We test MDGAN for multi-modal medical image translation, where we create two domain-specific manifold clusters on the manifold to translate segmentation maps into pseudo-CT and pseudo-MR images, respectively. We show that by traversing a path across the MR manifold cluster, the target output can be manipulated while still retaining the shape information from the input.
IVJun 28, 2020
Generalisable 3D Fabric Architecture for Streamlined Universal Multi-Dataset Medical Image SegmentationSiyu Liu, Wei Dai, Craig Engstrom et al.
Data scarcity is common in deep learning models for medical image segmentation. Previous works proposed multi-dataset learning, either simultaneously or via transfer learning to expand training sets. However, medical image datasets have diverse-sized images and features, and developing a model simultaneously for multiple datasets is challenging. This work proposes Fabric Image Representation Encoding Network (FIRENet), a universal architecture for simultaneous multi-dataset segmentation and transfer learning involving arbitrary numbers of dataset(s). To handle different-sized image and feature, a 3D fabric module is used to encapsulate many multi-scale sub-architectures. An optimal combination of these sub-architectures can be implicitly learnt to best suit the target dataset(s). For diverse-scale feature extraction, a 3D extension of atrous spatial pyramid pooling (ASPP3D) is used in each fabric node for a fine-grained coverage of rich-scale image features. In the first experiment, FIRENet performed 3D universal bone segmentation of multiple musculoskeletal datasets of the human knee, shoulder and hip joints and exhibited excellent simultaneous multi-dataset segmentation performance. When tested for transfer learning, FIRENet further exhibited excellent single dataset performance (when pre-training on a prostate dataset), as well as significantly improved universal bone segmentation performance. The following experiment involves the simultaneous segmentation of the 10 Medical Segmentation Decathlon (MSD) challenge datasets. FIRENet demonstrated good multi-dataset segmentation results and inter-dataset adaptability of highly diverse image sizes. In both experiments, FIRENet's streamlined multi-dataset learning with one unified network that requires no hyper-parameter tuning.