CVAug 30, 2023Code
MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer VisionJianning Li, Zongwei Zhou, Jiancheng Yang et al.
Prior to the deep learning era, shape was commonly used to describe the objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen from numerous shape-related publications in premier vision conferences as well as the growing popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915 models). For the medical domain, we present a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instrument, called MedShapeNet, created to facilitate the translation of data-driven vision algorithms to medical applications and to adapt SOTA vision algorithms to medical problems. As a unique feature, we directly model the majority of shapes on the imaging data of real patients. As of today, MedShapeNet includes 23 dataset with more than 100,000 shapes that are paired with annotations (ground truth). Our data is freely accessible via a web interface and a Python application programming interface (API) and can be used for discriminative, reconstructive, and variational benchmarks as well as various applications in virtual, augmented, or mixed reality, and 3D printing. Exemplary, we present use cases in the fields of classification of brain tumors, facial and skull reconstructions, multi-class anatomy completion, education, and 3D printing. In future, we will extend the data and improve the interfaces. The project pages are: https://medshapenet.ikim.nrw/ and https://github.com/Jianningli/medshapenet-feedback
IVSep 2, 2022Code
AutoPET Challenge: Combining nn-Unet with Swin UNETR Augmented by Maximum Intensity Projection ClassifierLars Heiliger, Zdravko Marinov, Max Hasin et al.
Tumor volume and changes in tumor characteristics over time are important biomarkers for cancer therapy. In this context, FDG-PET/CT scans are routinely used for staging and re-staging of cancer, as the radiolabeled fluorodeoxyglucose is taken up in regions of high metabolism. Unfortunately, these regions with high metabolism are not specific to tumors and can also represent physiological uptake by normal functioning organs, inflammation, or infection, making detailed and reliable tumor segmentation in these scans a demanding task. This gap in research is addressed by the AutoPET challenge, which provides a public data set with FDG-PET/CT scans from 900 patients to encourage further improvement in this field. Our contribution to this challenge is an ensemble of two state-of-the-art segmentation models, the nn-Unet and the Swin UNETR, augmented by a maximum intensity projection classifier that acts like a gating mechanism. If it predicts the existence of lesions, both segmentations are combined by a late fusion approach. Our solution achieves a Dice score of 72.12\% on patients diagnosed with lung cancer, melanoma, and lymphoma in our cross-validation. Code: https://github.com/heiligerl/autopet_submission
IVNov 25, 2022Code
Open-Source Skull Reconstruction with MONAIJianning Li, André Ferreira, Behrus Puladi et al.
We present a deep learning-based approach for skull reconstruction for MONAI, which has been pre-trained on the MUG500+ skull dataset. The implementation follows the MONAI contribution guidelines, hence, it can be easily tried out and used, and extended by MONAI users. The primary goal of this paper lies in the investigation of open-sourcing codes and pre-trained deep learning models under the MONAI framework. Nowadays, open-sourcing software, especially (pre-trained) deep learning models, has become increasingly important. Over the years, medical image analysis experienced a tremendous transformation. Over a decade ago, algorithms had to be implemented and optimized with low-level programming languages, like C or C++, to run in a reasonable time on a desktop PC, which was not as powerful as today's computers. Nowadays, users have high-level scripting languages like Python, and frameworks like PyTorch and TensorFlow, along with a sea of public code repositories at hand. As a result, implementations that had thousands of lines of C or C++ code in the past, can now be scripted with a few lines and in addition executed in a fraction of the time. To put this even on a higher level, the Medical Open Network for Artificial Intelligence (MONAI) framework tailors medical imaging research to an even more convenient process, which can boost and push the whole field. The MONAI framework is a freely available, community-supported, open-source and PyTorch-based framework, that also enables to provide research contributions with pre-trained models to others. Codes and pre-trained weights for skull reconstruction are publicly available at: https://github.com/Project-MONAI/research-contributions/tree/master/SkullRec
CVJul 4, 2022
FakeNews: GAN-based generation of realistic 3D volumetric data -- A systematic review and taxonomyAndré Ferreira, Jianning Li, Kelsey L. Pomykala et al.
With the massive proliferation of data-driven algorithms, such as deep learning-based approaches, the availability of high-quality data is of great interest. Volumetric data is very important in medicine, as it ranges from disease diagnoses to therapy monitoring. When the dataset is sufficient, models can be trained to help doctors with these tasks. Unfortunately, there are scenarios where large amounts of data is unavailable. For example, rare diseases and privacy issues can lead to restricted data availability. In non-medical fields, the high cost of obtaining enough high-quality data can also be a concern. A solution to these problems can be the generation of realistic synthetic data using Generative Adversarial Networks (GANs). The existence of these mechanisms is a good asset, especially in healthcare, as the data must be of good quality, realistic, and without privacy issues. Therefore, most of the publications on volumetric GANs are within the medical domain. In this review, we provide a summary of works that generate realistic volumetric synthetic data using GANs. We therefore outline GAN-based methods in these areas with common architectures, loss functions and evaluation metrics, including their advantages and disadvantages. We present a novel taxonomy, evaluations, challenges, and research opportunities to provide a holistic overview of the current state of volumetric GANs.
CLSep 29, 2023
Multilingual Natural Language Processing Model for Radiology Reports -- The Summary is all you need!Mariana Lindo, Ana Sofia Santos, André Ferreira et al.
The impression section of a radiology report summarizes important radiology findings and plays a critical role in communicating these findings to physicians. However, the preparation of these summaries is time-consuming and error-prone for radiologists. Recently, numerous models for radiology report summarization have been developed. Nevertheless, there is currently no model that can summarize these reports in multiple languages. Such a model could greatly improve future research and the development of Deep Learning models that incorporate data from patients with different ethnic backgrounds. In this study, the generation of radiology impressions in different languages was automated by fine-tuning a model, publicly available, based on a multilingual text-to-text Transformer to summarize findings available in English, Portuguese, and German radiology reports. In a blind test, two board-certified radiologists indicated that for at least 70% of the system-generated summaries, the quality matched or exceeded the corresponding human-written summaries, suggesting substantial clinical reliability. Furthermore, this study showed that the multilingual model outperformed other models that specialized in summarizing radiology reports in only one language, as well as models that were not specifically designed for summarizing radiology reports, such as ChatGPT.
CVMay 21
OSS: Open Suturing Skills Vision-Based Assessment Challenge 2024-2025Hanna Hoffmann, Setareh Bady, Claas de Boer et al.
Achieving high levels of surgical skill through effective training is essential for optimal patient outcomes. Automated, data-driven skill assessment holds significant potential to improve surgical training. While machine learning-based methods are increasingly popular for assessing skills in minimally invasive surgery, their application to open surgery remains limited. We present the results of a dedicated MICCAI challenge designed to benchmark and advance vision-based skill assessment in open surgery. The challenge dataset comprises videos of an open suturing training task recorded with a static GoPro camera in a dry-lab setting, with instrument trajectories available in addition to the primary video modality. The OSS Challenge was hosted over two consecutive years, comprising two and three independent tasks, respectively: (1) classifying skill level into four classes, (2) predicting the full Objective Structured Assessment of Technical Skills across eight categories, and (3) tracking hands and surgical tools. Participants submitted diverse solutions including deep learning-based video models, tracking-driven methods, and hybrid approaches. General-purpose spatiotemporal video models consistently achieved the strongest performance, though conceptually diverse approaches reached competitive levels when well-executed. Predicting fine-grained OSATS scores remains challenging but benefits substantially from increased training data. Keypoint tracking proves difficult given frequent occlusions and out-of-frame instances, limiting current applicability for motion-based skill analysis. This work benchmarks innovative and diverse solutions for surgical skill assessment, highlighting both the promise and current limitations of video-based evaluation in open surgery and identifying critical directions for advancing automated skill assessment toward clinical impact.
IVJul 22, 2024
Subthalamic Nucleus segmentation in high-field Magnetic Resonance data. Is space normalization by template co-registration necessary?Tomás Lima, Igor Varga, Eduard Bakštein et al.
Deep Brain Stimulation (DBS) is one of the most successful methods to diminish late-stage Parkinson's Disease (PD) symptoms. It is a delicate surgical procedure which requires detailed pre-surgical patient's study. High-field Magnetic Resonance Imaging (MRI) has proven its improved capacity of capturing the Subthalamic Nucleus (STN) - the main target of DBS in PD - in greater detail than lower field images. Here, we present a comparison between the performance of two different Deep Learning (DL) automatic segmentation architectures, one based in the registration to a brain template and the other performing the segmentation in in the MRI acquisition native space. The study was based on publicly available high-field 7 Tesla (T) brain MRI datasets of T1-weighted and T2-weighted sequences. nnUNet was used on the segmentation step of both architectures, while the data pre and post-processing pipelines diverged. The evaluation metrics showed that the performance of the segmentation directly in the native space yielded better results for the STN segmentation, despite not showing any advantage over the template-based method for the to other analysed structures: the Red Nucleus (RN) and the Substantia Nigra (SN).
IVJul 1, 2024
Deep Dive into MRI: Exploring Deep Learning Applications in 0.55T and 7T MRIAna Carolina Alves, André Ferreira, Behrus Puladi et al.
The development of magnetic resonance imaging (MRI) for medical imaging has provided a leap forward in diagnosis, providing a safe, non-invasive alternative to techniques involving ionising radiation exposure for diagnostic purposes. It was described by Block and Purcel in 1946, and it was not until 1980 that the first clinical application of MRI became available. Since that time the MRI has gone through many advances and has altered the way diagnosing procedures are performed. Due to its ability to improve constantly, MRI has become a commonly used practice among several specialisations in medicine. Particularly starting 0.55T and 7T MRI technologies have pointed out enhanced preservation of image detail and advanced tissue characterisation. This review examines the integration of deep learning (DL) techniques into these MRI modalities, disseminating and exploring the study applications. It highlights how DL contributes to 0.55T and 7T MRI data, showcasing the potential of DL in improving and refining these technologies. The review ends with a brief overview of how MRI technology will evolve in the coming years.
CVApr 24Code
VS-DDPM: Efficient Low-Cost Diffusion Model for Medical Modality TranslationNikoo Moradi, Gijs Luijten, Behrus Hinrichs-Puladi et al.
Diffusion models produce high-quality synthetic data but suffer from slow inference. We propose 3D Variable-Step Denoising Diffusion Probabilistic Model (VS-DDPM) a framework engineered to maintain generative quality while accelerating inference by several factors. We tested our approach on four tasks (missing MRI, tumor removal, MRI-to-sCT, and CBCT-to-sCT) within the BraTS2025 and SynthRAD2025 challenges. Designed for high efficiency under hardware and time constrains imposed by both challenges. VS-DDPM achieved state-of-the-art (SOTA) performance in missing MRI synthesis, yielding Dice scores of 0.80, 0.83, and 0.88 for the enhancing tumor, tumor core, and whole tumor regions, respectively, alongside a structural similarity index (SSIM) of 0.95. For MRI tumor removal, the model attained a root mean squared error (RMSE) of 0.053, a peak signal-to-noise ratio (PSNR) of 26.77, and an SSIM of 0.918. While the framework demonstrated competitive performance in MRI-to-sCT and CBCT-to-sCT tasks, it did not reach SOTA benchmarks, potentially due to sensitivities in data pre and post-processing pipelines or specific loss function configurations. These results demonstrate that VS-DDPM provides a robust and tunable solution for high-fidelity 3D medical image synthesis. The code is available in https://github.com/andre-fs-ferreira/SynthRAD_by_Faking_it.
CVNov 7, 2024Code
Improved Multi-Task Brain Tumour Segmentation with Synthetic Data AugmentationAndré Ferreira, Tiago Jesus, Behrus Puladi et al.
This paper presents the winning solution of task 1 and the third-placed solution of task 3 of the BraTS challenge. The use of automated tools in clinical practice has increased due to the development of more and more sophisticated and reliable algorithms. However, achieving clinical standards and developing tools for real-life scenarios is a major challenge. To this end, BraTS has organised tasks to find the most advanced solutions for specific purposes. In this paper, we propose the use of synthetic data to train state-of-the-art frameworks in order to improve the segmentation of adult gliomas in a post-treatment scenario, and the segmentation of meningioma for radiotherapy planning. Our results suggest that the use of synthetic data leads to more robust algorithms, although the synthetic data generation pipeline is not directly suited to the meningioma task. In task 1, we achieved a DSC of 0.7900, 0.8076, 0.7760, 0.8926, 0.7874, 0.8938 and a HD95 of 35.63, 30.35, 44.58, 16.87, 38.19, 17.95 for ET, NETC, RC, SNFH, TC and WT, respectively and, in task 3, we achieved a DSC of 0.801 and HD95 of 38.26, in the testing phase. The code for these tasks is available at https://github.com/ShadowTwin41/BraTS_2023_2024_solutions.
CVNov 7, 2024Code
Brain Tumour Removing and Missing Modality Generation using 3D WDMAndré Ferreira, Gijs Luijten, Behrus Puladi et al.
This paper presents the second-placed solution for task 8 and the participation solution for task 7 of BraTS 2024. The adoption of automated brain analysis algorithms to support clinical practice is increasing. However, many of these algorithms struggle with the presence of brain lesions or the absence of certain MRI modalities. The alterations in the brain's morphology leads to high variability and thus poor performance of predictive models that were trained only on healthy brains. The lack of information that is usually provided by some of the missing MRI modalities also reduces the reliability of the prediction models trained with all modalities. In order to improve the performance of these models, we propose the use of conditional 3D wavelet diffusion models. The wavelet transform enabled full-resolution image training and prediction on a GPU with 48 GB VRAM, without patching or downsampling, preserving all information for prediction. The code for these tasks is available at https://github.com/ShadowTwin41/BraTS_2023_2024_solutions.
CVJan 23
Curated endoscopic retrograde cholangiopancreatography images datasetAlda João Andrade, Mónica Martins, André Ferreira et al.
Endoscopic Retrograde Cholangiopancreatography (ERCP) is a key procedure in the diagnosis and treatment of biliary and pancreatic diseases. Artificial intelligence has been pointed as one solution to automatize diagnosis. However, public ERCP datasets are scarce, which limits the use of such approach. Therefore, this study aims to help fill this gap by providing a large and curated dataset. The collection is composed of 19.018 raw images and 19.317 processed from 1.602 patients. 5.519 images are labeled, which provides a ready to use dataset. All images were manually inspected and annotated by two gastroenterologist with more than 5 years of experience and reviewed by another gastroenterologist with more than 20 years of experience, all with more than 400 ERCP procedures annually. The utility and validity of the dataset is proven by a classification experiment. This collection aims to provide or contribute for a benchmark in automatic ERCP analysis and diagnosis of biliary and pancreatic diseases.
IVFeb 27, 2024
How we won BraTS 2023 Adult Glioma challenge? Just faking it! Enhanced Synthetic Data Augmentation and Model Ensemble for brain tumour segmentationAndré Ferreira, Naida Solak, Jianning Li et al.
Deep Learning is the state-of-the-art technology for segmenting brain tumours. However, this requires a lot of high-quality data, which is difficult to obtain, especially in the medical field. Therefore, our solutions address this problem by using unconventional mechanisms for data augmentation. Generative adversarial networks and registration are used to massively increase the amount of available samples for training three different deep learning models for brain tumour segmentation, the first task of the BraTS2023 challenge. The first model is the standard nnU-Net, the second is the Swin UNETR and the third is the winning solution of the BraTS 2021 Challenge. The entire pipeline is built on the nnU-Net implementation, except for the generation of the synthetic data. The use of convolutional algorithms and transformers is able to fill each other's knowledge gaps. Using the new metric, our best solution achieves the dice results 0.9005, 0.8673, 0.8509 and HD95 14.940, 14.467, 17.699 (whole tumour, tumour core and enhancing tumour) in the validation set.
LGApr 7, 2025
A Simultaneous Approach for Training Neural Differential-Algebraic Systems of EquationsLaurens R. Lueg, Victor Alves, Daniel Schicksnus et al.
Scientific machine learning is an emerging field that broadly describes the combination of scientific computing and machine learning to address challenges in science and engineering. Within the context of differential equations, this has produced highly influential methods, such as neural ordinary differential equations (NODEs). Recent works extend this line of research to consider neural differential-algebraic systems of equations (DAEs), where some unknown relationships within the DAE are learned from data. Training neural DAEs, similarly to neural ODEs, is computationally expensive, as it requires the solution of a DAE for every parameter update. Further, the rigorous consideration of algebraic constraints is difficult within common deep learning training algorithms such as stochastic gradient descent. In this work, we apply the simultaneous approach to neural DAE problems, resulting in a fully discretized nonlinear optimization problem, which is solved to local optimality and simultaneously obtains the neural network parameters and the solution to the corresponding DAE. We extend recent work demonstrating the simultaneous approach for neural ODEs, by presenting a general framework to solve neural DAEs, with explicit consideration of hybrid models, where some components of the DAE are known, e.g. physics-informed constraints. Furthermore, we present a general strategy for improving the performance and convergence of the nonlinear programming solver, based on solving an auxiliary problem for initialization and approximating Hessian terms. We achieve promising results in terms of accuracy, model generalizability and computational cost, across different problem settings such as sparse data, unobserved states and multiple trajectories. Lastly, we provide several promising future directions to improve the scalability and robustness of our approach.
IVNov 22, 2024
Comparative Analysis of nnUNet and MedNeXt for Head and Neck Tumor Segmentation in MRI-guided RadiotherapyNikoo Moradi, André Ferreira, Behrus Puladi et al.
Radiation therapy (RT) is essential in treating head and neck cancer (HNC), with magnetic resonance imaging(MRI)-guided RT offering superior soft tissue contrast and functional imaging. However, manual tumor segmentation is time-consuming and complex, and therfore remains a challenge. In this study, we present our solution as team TUMOR to the HNTS-MRG24 MICCAI Challenge which is focused on automated segmentation of primary gross tumor volumes (GTVp) and metastatic lymph node gross tumor volume (GTVn) in pre-RT and mid-RT MRI images. We utilized the HNTS-MRG2024 dataset, which consists of 150 MRI scans from patients diagnosed with HNC, including original and registered pre-RT and mid-RT T2-weighted images with corresponding segmentation masks for GTVp and GTVn. We employed two state-of-the-art models in deep learning, nnUNet and MedNeXt. For Task 1, we pretrained models on pre-RT registered and mid-RT images, followed by fine-tuning on original pre-RT images. For Task 2, we combined registered pre-RT images, registered pre-RT segmentation masks, and mid-RT data as a multi-channel input for training. Our solution for Task 1 achieved 1st place in the final test phase with an aggregated Dice Similarity Coefficient of 0.8254, and our solution for Task 2 ranked 8th with a score of 0.7005. The proposed solution is publicly available at Github Repository.
IVFeb 6, 2024
Deep PCCT: Photon Counting Computed Tomography Deep Learning Applications ReviewAna Carolina Alves, André Ferreira, Gijs Luijten et al.
Medical imaging faces challenges such as limited spatial resolution, interference from electronic noise and poor contrast-to-noise ratios. Photon Counting Computed Tomography (PCCT) has emerged as a solution, addressing these issues with its innovative technology. This review delves into the recent developments and applications of PCCT in pre-clinical research, emphasizing its potential to overcome traditional imaging limitations. For example PCCT has demonstrated remarkable efficacy in improving the detection of subtle abnormalities in breast, providing a level of detail previously unattainable. Examining the current literature on PCCT, it presents a comprehensive analysis of the technology, highlighting the main features of scanners and their varied applications. In addition, it explores the integration of deep learning into PCCT, along with the study of radiomic features, presenting successful applications in data processing. While acknowledging these advances, it also discusses the existing challenges in this field, paving the way for future research and improvements in medical imaging technologies. Despite the limited number of articles on this subject, due to the recent integration of PCCT at a clinical level, its potential benefits extend to various diagnostic applications.
SEJul 4, 2025
Is It Time To Treat Prompts As Code? A Multi-Use Case Study For Prompt Optimization Using DSPyFrancisca Lemos, Victor Alves, Filipa Ferraz
Although prompt engineering is central to unlocking the full potential of Large Language Models (LLMs), crafting effective prompts remains a time-consuming trial-and-error process that relies on human intuition. This study investigates Declarative Self-improving Python (DSPy), an optimization framework that programmatically creates and refines prompts, applied to five use cases: guardrail enforcement, hallucination detection in code, code generation, routing agents, and prompt evaluation. Each use case explores how prompt optimization via DSPy influences performance. While some cases demonstrated modest improvements - such as minor gains in the guardrails use case and selective enhancements in hallucination detection - others showed notable benefits. The prompt evaluation criterion task demonstrated a substantial performance increase, rising accuracy from 46.2% to 64.0%. In the router agent case, the possibility of improving a poorly performing prompt and of a smaller model matching a stronger one through optimized prompting was explored. Although prompt refinement increased accuracy from 85.0% to 90.0%, using the optimized prompt with a cheaper model did not improve performance. Overall, this study's findings suggest that DSPy's systematic prompt optimization can enhance LLM performance, particularly when instruction tuning and example selection are optimized together. However, the impact varies by task, highlighting the importance of evaluating specific use cases in prompt optimization research.
IVJun 13, 2025
Enhancing Privacy: The Utility of Stand-Alone Synthetic CT and MRI for Tumor and Bone SegmentationAndré Ferreira, Kunpeng Xie, Caroline Wilpert et al.
AI requires extensive datasets, while medical data is subject to high data protection. Anonymization is essential, but poses a challenge for some regions, such as the head, as identifying structures overlap with regions of clinical interest. Synthetic data offers a potential solution, but studies often lack rigorous evaluation of realism and utility. Therefore, we investigate to what extent synthetic data can replace real data in segmentation tasks. We employed head and neck cancer CT scans and brain glioma MRI scans from two large datasets. Synthetic data were generated using generative adversarial networks and diffusion models. We evaluated the quality of the synthetic data using MAE, MS-SSIM, Radiomics and a Visual Turing Test (VTT) performed by 5 radiologists and their usefulness in segmentation tasks using DSC. Radiomics indicates high fidelity of synthetic MRIs, but fall short in producing highly realistic CT tissue, with correlation coefficient of 0.8784 and 0.5461 for MRI and CT tumors, respectively. DSC results indicate limited utility of synthetic data: tumor segmentation achieved DSC=0.064 on CT and 0.834 on MRI, while bone segmentation a mean DSC=0.841. Relation between DSC and correlation is observed, but is limited by the complexity of the task. VTT results show synthetic CTs' utility, but with limited educational applications. Synthetic data can be used independently for the segmentation task, although limited by the complexity of the structures to segment. Advancing generative models to better tolerate heterogeneous inputs and learn subtle details is essential for enhancing their realism and expanding their application potential.
IVMar 17, 2025
The Impact of Artificial Intelligence on Emergency Medicine: A Review of Recent AdvancesGustavo Correia, Victor Alves, Paulo Novais
Artificial Intelligence (AI) is revolutionizing emergency medicine by enhancing diagnostic processes and improving patient outcomes. This article provides a review of the current applications of AI in emergency imaging studies, focusing on the last five years of advancements. AI technologies, particularly machine learning and deep learning, are pivotal in interpreting complex imaging data, offering rapid, accurate diagnoses and potentially surpassing traditional diagnostic methods. Studies highlighted within the article demonstrate AI's capabilities in accurately detecting conditions such as fractures, pneumothorax, and pulmonary diseases from various imaging modalities including X-rays, CT scans, and MRIs. Furthermore, AI's ability to predict clinical outcomes like mechanical ventilation needs illustrates its potential in crisis resource optimization. Despite these advancements, the integration of AI into clinical practice presents challenges such as data privacy, algorithmic bias, and the need for extensive validation across diverse settings. This review underscores the transformative potential of AI in emergency settings, advocating for a future where AI and clinical expertise synergize to elevate patient care standards.
IVDec 27, 2021
Generation of Synthetic Rat Brain MRI scans with a 3D Enhanced Alpha-GANAndré Ferreira, Ricardo Magalhães, Sébastien Mériaux et al.
Translational brain research using Magnetic Resonance Imaging (MRI) is becoming increasingly popular as animal models are an essential part of scientific studies and more ultra-high-field scanners are becoming available. Some disadvantages of MRI are the availability of MRI scanners and the time required for a full scanning session (it usually takes over 30 minutes). Privacy laws and the 3Rs ethics rule also make it difficult to create large datasets for training deep learning models. Generative Adversarial Networks (GANs) can perform data augmentation with higher quality than other techniques. In this work, the alpha-GAN architecture is used to test its ability to produce realistic 3D MRI scans of the rat brain. As far as the authors are aware, this is the first time that a GAN-based approach has been used for data augmentation in preclinical data. The generated scans are evaluated using various qualitative and quantitative metrics. A Turing test conducted by 4 experts has shown that the generated scans can trick almost any expert. The generated scans were also used to evaluate their impact on the performance of an existing deep learning model developed for segmenting the rat brain into white matter, grey matter and cerebrospinal fluid. The models were compared using the Dice score. The best results for whole brain and white matter segmentation were obtained when 174 real scans and 348 synthetic scans were used, with improvements of 0.0172 and 0.0129, respectively. Using 174 real scans and 87 synthetic scans resulted in improvements of 0.0038 and 0.0764 for grey matter and CSF segmentation, respectively. Thus, by using the proposed new normalisation layer and loss functions, it was possible to improve the realism of the generated rat MRI scans and it was shown that using the generated data improved the segmentation model more than using the conventional data augmentation.
IVJan 2, 2021
Combining unsupervised and supervised learning for predicting the final stroke lesionAdriano Pinto, Sérgio Pereira, Raphael Meier et al.
Predicting the final ischaemic stroke lesion provides crucial information regarding the volume of salvageable hypoperfused tissue, which helps physicians in the difficult decision-making process of treatment planning and intervention. Treatment selection is influenced by clinical diagnosis, which requires delineating the stroke lesion, as well as characterising cerebral blood flow dynamics using neuroimaging acquisitions. Nonetheless, predicting the final stroke lesion is an intricate task, due to the variability in lesion size, shape, location and the underlying cerebral haemodynamic processes that occur after the ischaemic stroke takes place. Moreover, since elapsed time between stroke and treatment is related to the loss of brain tissue, assessing and predicting the final stroke lesion needs to be performed in a short period of time, which makes the task even more complex. Therefore, there is a need for automatic methods that predict the final stroke lesion and support physicians in the treatment decision process. We propose a fully automatic deep learning method based on unsupervised and supervised learning to predict the final stroke lesion after 90 days. Our aim is to predict the final stroke lesion location and extent, taking into account the underlying cerebral blood flow dynamics that can influence the prediction. To achieve this, we propose a two-branch Restricted Boltzmann Machine, which provides specialized data-driven features from different sets of standard parametric Magnetic Resonance Imaging maps. These data-driven feature maps are then combined with the parametric Magnetic Resonance Imaging maps, and fed to a Convolutional and Recurrent Neural Network architecture. We evaluated our proposal on the publicly available ISLES 2017 testing dataset, reaching a Dice score of 0.38, Hausdorff Distance of 29.21 mm, and Average Symmetric Surface Distance of 5.52 mm.
CVJun 19, 2020
Adaptive feature recombination and recalibration for semantic segmentation with Fully Convolutional NetworksSergio Pereira, Adriano Pinto, Joana Amorim et al.
Fully Convolutional Networks have been achieving remarkable results in image semantic segmentation, while being efficient. Such efficiency results from the capability of segmenting several voxels in a single forward pass. So, there is a direct spatial correspondence between a unit in a feature map and the voxel in the same location. In a convolutional layer, the kernel spans over all channels and extracts information from them. We observe that linear recombination of feature maps by increasing the number of channels followed by compression may enhance their discriminative power. Moreover, not all feature maps have the same relevance for the classes being predicted. In order to learn the inter-channel relationships and recalibrate the channels to suppress the less relevant ones, Squeeze and Excitation blocks were proposed in the context of image classification with Convolutional Neural Networks. However, this is not well adapted for segmentation with Fully Convolutional Networks since they segment several objects simultaneously, hence a feature map may contain relevant information only in some locations. In this paper, we propose recombination of features and a spatially adaptive recalibration block that is adapted for semantic segmentation with Fully Convolutional Networks - the SegSE block. Feature maps are recalibrated by considering the cross-channel information together with spatial relevance. Experimental results indicate that Recombination and Recalibration improve the results of a competitive baseline, and generalize across three different problems: brain tumor segmentation, stroke penumbra estimation, and ischemic stroke lesion outcome prediction. The obtained results are competitive or outperform the state of the art in the three applications.
CVSep 25, 2018
Automatic brain tumor grading from MRI data using convolutional neural networks and quality assessmentSergio Pereira, Raphael Meier, Victor Alves et al.
Glioblastoma Multiforme is a high grade, very aggressive, brain tumor, with patients having a poor prognosis. Lower grade gliomas are less aggressive, but they can evolve into higher grade tumors over time. Patient management and treatment can vary considerably with tumor grade, ranging from tumor resection followed by a combined radio- and chemotherapy to a "wait and see" approach. Hence, tumor grading is important for adequate treatment planning and monitoring. The gold standard for tumor grading relies on histopathological diagnosis of biopsy specimens. However, this procedure is invasive, time consuming, and prone to sampling error. Given these disadvantages, automatic tumor grading from widely used MRI protocols would be clinically important, as a way to expedite treatment planning and assessment of tumor evolution. In this paper, we propose to use Convolutional Neural Networks for predicting tumor grade directly from imaging data. In this way, we overcome the need for expert annotations of regions of interest. We evaluate two prediction approaches: from the whole brain, and from an automatically defined tumor region. Finally, we employ interpretability methodologies as a quality assurance stage to check if the method is using image regions indicative of tumor grade for classification.
CVJun 12, 2018
Enhancing clinical MRI Perfusion maps with data-driven maps of complementary nature for lesion outcome predictionAdriano Pinto, Sergio Pereira, Raphael Meier et al.
Stroke is the second most common cause of death in developed countries, where rapid clinical intervention can have a major impact on a patient's life. To perform the revascularization procedure, the decision making of physicians considers its risks and benefits based on multi-modal MRI and clinical experience. Therefore, automatic prediction of the ischemic stroke lesion outcome has the potential to assist the physician towards a better stroke assessment and information about tissue outcome. Typically, automatic methods consider the information of the standard kinetic models of diffusion and perfusion MRI (e.g. Tmax, TTP, MTT, rCBF, rCBV) to perform lesion outcome prediction. In this work, we propose a deep learning method to fuse this information with an automated data selection of the raw 4D PWI image information, followed by a data-driven deep-learning modeling of the underlying blood flow hemodynamics. We demonstrate the ability of the proposed approach to improve prediction of tissue at risk before therapy, as compared to only using the standard clinical perfusion maps, hence suggesting on the potential benefits of the proposed data-driven raw perfusion data modelling approach.
CVJun 6, 2018
Adaptive feature recombination and recalibration for semantic segmentation: application to brain tumor segmentation in MRISérgio Pereira, Victor Alves, Carlos A. Silva
Convolutional neural networks (CNNs) have been successfully used for brain tumor segmentation, specifically, fully convolutional networks (FCNs). FCNs can segment a set of voxels at once, having a direct spatial correspondence between units in feature maps (FMs) at a given location and the corresponding classified voxels. In convolutional layers, FMs are merged to create new FMs, so, channel combination is crucial. However, not all FMs have the same relevance for a given class. Recently, in classification problems, Squeeze-and-Excitation (SE) blocks have been proposed to re-calibrate FMs as a whole, and suppress the less informative ones. However, this is not optimal in FCN due to the spatial correspondence between units and voxels. In this article, we propose feature recombination through linear expansion and compression to create more complex features for semantic segmentation. Additionally, we propose a segmentation SE (SegSE) block for feature recalibration that collects contextual information, while maintaining the spatial meaning. Finally, we evaluate the proposed methods in brain tumor segmentation, using publicly available data.