Cosmin I. Bercea

IV
h-index69
23papers
234citations
Novelty49%
AI Score56

23 Papers

IVJul 4, 2024Code
Unsupervised Analysis of Alzheimer's Disease Signatures using 3D Deformable Autoencoders

Mehmet Yigit Avci, Emily Chan, Veronika Zimmer et al.

With the increasing incidence of neurodegenerative diseases such as Alzheimer's Disease (AD), there is a need for further research that enhances detection and monitoring of the diseases. We present MORPHADE (Morphological Autoencoders for Alzheimer's Disease Detection), a novel unsupervised learning approach which uses deformations to allow the analysis of 3D T1-weighted brain images. To the best of our knowledge, this is the first use of deformations with deep unsupervised learning to not only detect, but also localize and assess the severity of structural changes in the brain due to AD. We obtain markedly higher anomaly scores in clinically important areas of the brain in subjects with AD compared to healthy controls, showcasing that our method is able to effectively locate AD-related atrophy. We additionally observe a visual correlation between the severity of atrophy highlighted in our anomaly maps and medial temporal lobe atrophy scores evaluated by a clinical expert. Finally, our method achieves an AUROC of 0.80 in detecting AD, out-performing several supervised and unsupervised baselines. We believe our framework shows promise as a tool towards improved understanding, monitoring and detection of AD. To support further research and application, we have made our code publicly available at github.com/ci-ber/MORPHADE.

CVJun 8, 2022
What do we learn? Debunking the Myth of Unsupervised Outlier Detection

Cosmin I. Bercea, Daniel Rueckert, Julia A. Schnabel

Even though auto-encoders (AEs) have the desirable property of learning compact representations without labels and have been widely applied to out-of-distribution (OoD) detection, they are generally still poorly understood and are used incorrectly in detecting outliers where the normal and abnormal distributions are strongly overlapping. In general, the learned manifold is assumed to contain key information that is only important for describing samples within the training distribution, and that the reconstruction of outliers leads to high residual errors. However, recent work suggests that AEs are likely to be even better at reconstructing some types of OoD samples. In this work, we challenge this assumption and investigate what auto-encoders actually learn when they are posed to solve two different tasks. First, we propose two metrics based on the Fréchet inception distance (FID) and confidence scores of a trained classifier to assess whether AEs can learn the training distribution and reliably recognize samples from other domains. Second, we investigate whether AEs are able to synthesize normal images from samples with abnormal regions, on a more challenging lung pathology detection task. We have found that state-of-the-art (SOTA) AEs are either unable to constrain the latent manifold and allow reconstruction of abnormal patterns, or they are failing to accurately restore the inputs from their latent distribution, resulting in blurred or misaligned reconstructions. We propose novel deformable auto-encoders (MorphAEus) to learn perceptually aware global image priors and locally adapt their morphometry based on estimated dense deformation fields. We demonstrate superior performance over unsupervised methods in detecting OoD and pathology.

IVAug 26, 2023
Bias in Unsupervised Anomaly Detection in Brain MRI

Cosmin I. Bercea, Esther Puyol-Antón, Benedikt Wiestler et al.

Unsupervised anomaly detection methods offer a promising and flexible alternative to supervised approaches, holding the potential to revolutionize medical scan analysis and enhance diagnostic performance. In the current landscape, it is commonly assumed that differences between a test case and the training distribution are attributed solely to pathological conditions, implying that any disparity indicates an anomaly. However, the presence of other potential sources of distributional shift, including scanner, age, sex, or race, is frequently overlooked. These shifts can significantly impact the accuracy of the anomaly detection task. Prominent instances of such failures have sparked concerns regarding the bias, credibility, and fairness of anomaly detection. This work presents a novel analysis of biases in unsupervised anomaly detection. By examining potential non-pathological distributional shifts between the training and testing distributions, we shed light on the extent of these biases and their influence on anomaly detection results. Moreover, this study examines the algorithmic limitations that arise due to biases, providing valuable insights into the challenges encountered by anomaly detection algorithms in accurately learning and capturing the entire range of variability present in the normative distribution. Through this analysis, we aim to enhance the understanding of these biases and pave the way for future improvements in the field. Here, we specifically investigate Alzheimer's disease detection from brain MR imaging as a case study, revealing significant biases related to sex, race, and scanner variations that substantially impact the results. These findings align with the broader goal of improving the reliability, fairness, and effectiveness of anomaly detection in medical imaging.

CLNov 6, 2025Code
Learning to reason about rare diseases through retrieval-augmented agents

Ha Young Kim, Jun Li, Ana Beatriz Solana et al.

Rare diseases represent the long tail of medical imaging, where AI models often fail due to the scarcity of representative training data. In clinical workflows, radiologists frequently consult case reports and literature when confronted with unfamiliar findings. Following this line of reasoning, we introduce RADAR, Retrieval Augmented Diagnostic Reasoning Agents, an agentic system for rare disease detection in brain MRI. Our approach uses AI agents with access to external medical knowledge by embedding both case reports and literature using sentence transformers and indexing them with FAISS to enable efficient similarity search. The agent retrieves clinically relevant evidence to guide diagnostic decision making on unseen diseases, without the need of additional training. Designed as a model-agnostic reasoning module, RADAR can be seamlessly integrated with diverse large language models, consistently improving their rare pathology recognition and interpretability. On the NOVA dataset comprising 280 distinct rare diseases, RADAR achieves up to a 10.2% performance gain, with the strongest improvements observed for open source models such as DeepSeek. Beyond accuracy, the retrieved examples provide interpretable, literature grounded explanations, highlighting retrieval-augmented reasoning as a powerful paradigm for low-prevalence conditions in medical imaging.

CLApr 27Code
Dynamic Decision Learning: Test-Time Evolution for Abnormality Grounding in Rare Diseases

Jun Li, Mingxuan Liu, Jiazhen Pan et al.

Clinical abnormality grounding for rare diseases is often hindered by data scarcity, making supervised fine-tuning impractical and single-pass inference highly unstable. We propose Dynamic Decision Learning (DDL), a framework that enables frozen large vision-language models (LVLMs) to refine their decisions across both language and visual spaces by optimizing instructions and consolidating predictions under visual perturbations. This process improves localization quality and produces a consensus-based reliability score that quantifies model confidence. Results on brain imaging benchmarks, including a rare-disease dataset with 281 pathology types across models ranging from 3B to 72B parameters, show that DDL improves mAP@75 by up to 105% on rare-disease cases and outperforms adaptation baselines and supervised fine-tuning. Furthermore, DDL demonstrates stronger calibration between reliability scores and localization accuracy under severe distribution shifts and increasing task difficulty. Code is available at: https://lijunrio.github.io/DDL/

IVNov 12, 2025Code
TomoGraphView: 3D Medical Image Classification with Omnidirectional Slice Representations and Graph Neural Networks

Johannes Kiechle, Stefan M. Fischer, Daniel M. Lang et al.

The growing number of medical tomography examinations has necessitated the development of automated methods capable of extracting comprehensive imaging features to facilitate downstream tasks such as tumor characterization, while assisting physicians in managing their growing workload. However, 3D medical image classification remains a challenging task due to the complex spatial relationships and long-range dependencies inherent in volumetric data. Training models from scratch suffers from low data regimes, and the absence of 3D large-scale multimodal datasets has limited the development of 3D medical imaging foundation models. Recent studies, however, have highlighted the potential of 2D vision foundation models, originally trained on natural images, as powerful feature extractors for medical image analysis. Despite these advances, existing approaches that apply 2D models to 3D volumes via slice-based decomposition remain suboptimal. Conventional volume slicing strategies, which rely on canonical planes such as axial, sagittal, or coronal, may inadequately capture the spatial extent of target structures when these are misaligned with standardized viewing planes. Furthermore, existing slice-wise aggregation strategies rarely account for preserving the volumetric structure, resulting in a loss of spatial coherence across slices. To overcome these limitations, we propose TomoGraphView, a novel framework that integrates omnidirectional volume slicing with spherical graph-based feature aggregation. We publicly share our accessible code base at http://github.com/compai-lab/2025-MedIA-kiechle and provide a user-friendly library for omnidirectional volume slicing at https://pypi.org/project/OmniSlicer.

IVJul 21, 2024
MedEdit: Counterfactual Diffusion-based Image Editing on Brain MRI

Malek Ben Alaya, Daniel M. Lang, Benedikt Wiestler et al.

Denoising diffusion probabilistic models enable high-fidelity image synthesis and editing. In biomedicine, these models facilitate counterfactual image editing, producing pairs of images where one is edited to simulate hypothetical conditions. For example, they can model the progression of specific diseases, such as stroke lesions. However, current image editing techniques often fail to generate realistic biomedical counterfactuals, either by inadequately modeling indirect pathological effects like brain atrophy or by excessively altering the scan, which disrupts correspondence to the original images. Here, we propose MedEdit, a conditional diffusion model for medical image editing. MedEdit induces pathology in specific areas while balancing the modeling of disease effects and preserving the integrity of the original scan. We evaluated MedEdit on the Atlas v2.0 stroke dataset using Frechet Inception Distance and Dice scores, outperforming state-of-the-art diffusion-based methods such as Palette (by 45%) and SDEdit (by 61%). Additionally, clinical evaluations by a board-certified neuroradiologist confirmed that MedEdit generated realistic stroke scans indistinguishable from real ones. We believe this work will enable counterfactual image editing research to further advance the development of realistic and clinically useful imaging tools.

IVMar 13, 2024Code
Diffusion Models with Implicit Guidance for Medical Anomaly Detection

Cosmin I. Bercea, Benedikt Wiestler, Daniel Rueckert et al.

Diffusion models have advanced unsupervised anomaly detection by improving the transformation of pathological images into pseudo-healthy equivalents. Nonetheless, standard approaches may compromise critical information during pathology removal, leading to restorations that do not align with unaffected regions in the original scans. Such discrepancies can inadvertently increase false positive rates and reduce specificity, complicating radiological evaluations. This paper introduces Temporal Harmonization for Optimal Restoration (THOR), which refines the de-noising process by integrating implicit guidance through temporal anomaly maps. THOR aims to preserve the integrity of healthy tissue in areas unaffected by pathology. Comparative evaluations show that THOR surpasses existing diffusion-based methods in detecting and segmenting anomalies in brain MRIs and wrist X-rays. Code: https://github.com/ci-ber/THOR_DDPM.

CVApr 11, 2024Code
Language Models Meet Anomaly Detection for Better Interpretability and Generalizability

Jun Li, Su Hwan Kim, Philip Müller et al.

This research explores the integration of language models and unsupervised anomaly detection in medical imaging, addressing two key questions: (1) Can language models enhance the interpretability of anomaly detection maps? and (2) Can anomaly maps improve the generalizability of language models in open-set anomaly detection tasks? To investigate these questions, we introduce a new dataset for multi-image visual question-answering on brain magnetic resonance images encompassing multiple conditions. We propose KQ-Former (Knowledge Querying Transformer), which is designed to optimally align visual and textual information in limited-sample contexts. Our model achieves a 60.81% accuracy on closed questions, covering disease classification and severity across 15 different classes. For open questions, KQ-Former demonstrates a 70% improvement over the baseline with a BLEU-4 score of 0.41, and achieves the highest entailment ratios (up to 71.9%) and lowest contradiction ratios (down to 10.0%) among various natural language inference models. Furthermore, integrating anomaly maps results in an 18% accuracy increase in detecting open-set anomalies, thereby enhancing the language model's generalizability to previously unseen medical conditions. The code and dataset are available at https://github.com/compai-lab/miccai-2024-junli?tab=readme-ov-file

AIJan 21
Measuring and Aligning Abstraction in Vision-Language Models with Medical Taxonomies

Ben Schaper, Maxime Di Folco, Bernhard Kainz et al.

Vision-Language Models show strong zero-shot performance for chest X-ray classification, but standard flat metrics fail to distinguish between clinically minor and severe errors. This work investigates how to quantify and mitigate abstraction errors by leveraging medical taxonomies. We benchmark several state-of-the-art VLMs using hierarchical metrics and introduce Catastrophic Abstraction Errors to capture cross-branch mistakes. Our results reveal substantial misalignment of VLMs with clinical taxonomies despite high flat performance. To address this, we propose risk-constrained thresholding and taxonomy-aware fine-tuning with radial embeddings, which reduce severe abstraction errors to below 2 per cent while maintaining competitive performance. These findings highlight the importance of hierarchical evaluation and representation-level alignment for safer and more clinically meaningful deployment of VLMs.

IVJan 19, 2024Code
Towards Universal Unsupervised Anomaly Detection in Medical Imaging

Cosmin I. Bercea, Benedikt Wiestler, Daniel Rueckert et al.

The increasing complexity of medical imaging data underscores the need for advanced anomaly detection methods to automatically identify diverse pathologies. Current methods face challenges in capturing the broad spectrum of anomalies, often limiting their use to specific lesion types in brain scans. To address this challenge, we introduce a novel unsupervised approach, termed \textit{Reversed Auto-Encoders (RA)}, designed to create realistic pseudo-healthy reconstructions that enable the detection of a wider range of pathologies. We evaluate the proposed method across various imaging modalities, including magnetic resonance imaging (MRI) of the brain, pediatric wrist X-ray, and chest X-ray, and demonstrate superior performance in detecting anomalies compared to existing state-of-the-art methods. Our unsupervised anomaly detection approach may enhance diagnostic accuracy in medical imaging by identifying a broader range of unknown pathologies. Our code is publicly available at: \url{https://github.com/ci-ber/RA}.

IVMar 21, 2024
Denoising Diffusion Models for 3D Healthy Brain Tissue Inpainting

Alicia Durrer, Julia Wolleb, Florentin Bieder et al.

Monitoring diseases that affect the brain's structural integrity requires automated analysis of magnetic resonance (MR) images, e.g., for the evaluation of volumetric changes. However, many of the evaluation tools are optimized for analyzing healthy tissue. To enable the evaluation of scans containing pathological tissue, it is therefore required to restore healthy tissue in the pathological areas. In this work, we explore and extend denoising diffusion models for consistent inpainting of healthy 3D brain tissue. We modify state-of-the-art 2D, pseudo-3D, and 3D methods working in the image space, as well as 3D latent and 3D wavelet diffusion models, and train them to synthesize healthy brain tissue. Our evaluation shows that the pseudo-3D model performs best regarding the structural-similarity index, peak signal-to-noise ratio, and mean squared error. To emphasize the clinical relevance, we fine-tune this model on data containing synthetic MS lesions and evaluate it on a downstream brain tissue segmentation task, whereby it outperforms the established FMRIB Software Library (FSL) lesion-filling method.

CVSep 8, 2025
Does DINOv3 Set a New Medical Vision Standard?

Che Liu, Yinda Chen, Haoyuan Shi et al.

The advent of large-scale vision foundation models, pre-trained on diverse natural images, has marked a paradigm shift in computer vision. However, how the frontier vision foundation models' efficacies transfer to specialized domains remains such as medical imaging remains an open question. This report investigates whether DINOv3, a state-of-the-art self-supervised vision transformer (ViT) that features strong capability in dense prediction tasks, can directly serve as a powerful, unified encoder for medical vision tasks without domain-specific pre-training. To answer this, we benchmark DINOv3 across common medical vision tasks, including 2D/3D classification and segmentation on a wide range of medical imaging modalities. We systematically analyze its scalability by varying model sizes and input image resolutions. Our findings reveal that DINOv3 shows impressive performance and establishes a formidable new baseline. Remarkably, it can even outperform medical-specific foundation models like BiomedCLIP and CT-Net on several tasks, despite being trained solely on natural images. However, we identify clear limitations: The model's features degrade in scenarios requiring deep domain specialization, such as in Whole-Slide Pathological Images (WSIs), Electron Microscopy (EM), and Positron Emission Tomography (PET). Furthermore, we observe that DINOv3 does not consistently obey scaling law in the medical domain; performance does not reliably increase with larger models or finer feature resolutions, showing diverse scaling behaviors across tasks. Ultimately, our work establishes DINOv3 as a strong baseline, whose powerful visual features can serve as a robust prior for multiple complex medical tasks. This opens promising future directions, such as leveraging its features to enforce multiview consistency in 3D reconstruction.

IVMay 20, 2025
NOVA: A Benchmark for Anomaly Localization and Clinical Reasoning in Brain MRI

Cosmin I. Bercea, Jun Li, Philipp Raffler et al.

In many real-world applications, deployed models encounter inputs that differ from the data seen during training. Out-of-distribution detection identifies whether an input stems from an unseen distribution, while open-world recognition flags such inputs to ensure the system remains robust as ever-emerging, previously $unknown$ categories appear and must be addressed without retraining. Foundation and vision-language models are pre-trained on large and diverse datasets with the expectation of broad generalization across domains, including medical imaging. However, benchmarking these models on test sets with only a few common outlier types silently collapses the evaluation back to a closed-set problem, masking failures on rare or truly novel conditions encountered in clinical use. We therefore present $NOVA$, a challenging, real-life $evaluation-only$ benchmark of $\sim$900 brain MRI scans that span 281 rare pathologies and heterogeneous acquisition protocols. Each case includes rich clinical narratives and double-blinded expert bounding-box annotations. Together, these enable joint assessment of anomaly localisation, visual captioning, and diagnostic reasoning. Because NOVA is never used for training, it serves as an $extreme$ stress-test of out-of-distribution generalisation: models must bridge a distribution gap both in sample appearance and in semantic space. Baseline results with leading vision-language models (GPT-4o, Gemini 2.0 Flash, and Qwen2.5-VL-72B) reveal substantial performance drops across all tasks, establishing NOVA as a rigorous testbed for advancing models that can detect, localize, and reason about truly unknown anomalies.

CVMar 5, 2025
Enhancing Abnormality Grounding for Vision Language Models with Knowledge Descriptions

Jun Li, Che Liu, Wenjia Bai et al.

Visual Language Models (VLMs) have demonstrated impressive capabilities in visual grounding tasks. However, their effectiveness in the medical domain, particularly for abnormality detection and localization within medical images, remains underexplored. A major challenge is the complex and abstract nature of medical terminology, which makes it difficult to directly associate pathological anomaly terms with their corresponding visual features. In this work, we introduce a novel approach to enhance VLM performance in medical abnormality detection and localization by leveraging decomposed medical knowledge. Instead of directly prompting models to recognize specific abnormalities, we focus on breaking down medical concepts into fundamental attributes and common visual patterns. This strategy promotes a stronger alignment between textual descriptions and visual features, improving both the recognition and localization of abnormalities in medical images.We evaluate our method on the 0.23B Florence-2 base model and demonstrate that it achieves comparable performance in abnormality grounding to significantly larger 7B LLaVA-based medical VLMs, despite being trained on only 1.5% of the data used for such models. Experimental results also demonstrate the effectiveness of our approach in both known and previously unseen abnormalities, suggesting its strong generalization capabilities.

IVOct 31, 2024
Denoising Diffusion Models for Anomaly Localization in Medical Images

Cosmin I. Bercea, Philippe C. Cattin, Julia A. Schnabel et al.

This chapter explores anomaly localization in medical images using denoising diffusion models. After providing a brief methodological background of these models, including their application to image reconstruction and their conditioning using guidance mechanisms, we provide an overview of available datasets and evaluation metrics suitable for their application to anomaly localization in medical images. In this context, we discuss supervision schemes ranging from fully supervised segmentation to semi-supervised, weakly supervised, self-supervised, and unsupervised methods, and provide insights into the effectiveness and limitations of these approaches. Furthermore, we highlight open challenges in anomaly localization, including detection bias, domain shift, computational cost, and model interpretability. Our goal is to provide an overview of the current state of the art in the field, outline research gaps, and highlight the potential of diffusion models for robust anomaly localization in medical images.

CVSep 22, 2025
Influence of Classification Task and Distribution Shift Type on OOD Detection in Fetal Ultrasound

Chun Kit Wong, Anders N. Christensen, Cosmin I. Bercea et al.

Reliable out-of-distribution (OOD) detection is important for safe deployment of deep learning models in fetal ultrasound amidst heterogeneous image characteristics and clinical settings. OOD detection relies on estimating a classification model's uncertainty, which should increase for OOD samples. While existing research has largely focused on uncertainty quantification methods, this work investigates the impact of the classification task itself. Through experiments with eight uncertainty quantification methods across four classification tasks, we demonstrate that OOD detection performance significantly varies with the task, and that the best task depends on the defined ID-OOD criteria; specifically, whether the OOD sample is due to: i) an image characteristic shift or ii) an anatomical feature shift. Furthermore, we reveal that superior OOD detection does not guarantee optimal abstained prediction, underscoring the necessity to align task selection and uncertainty strategies with the specific downstream application in medical image analysis.

CVAug 6, 2025
Knowledge to Sight: Reasoning over Visual Attributes via Knowledge Decomposition for Abnormality Grounding

Jun Li, Che Liu, Wenjia Bai et al.

In this work, we address the problem of grounding abnormalities in medical images, where the goal is to localize clinical findings based on textual descriptions. While generalist Vision-Language Models (VLMs) excel in natural grounding tasks, they often struggle in the medical domain due to rare, compositional, and domain-specific terms that are poorly aligned with visual patterns. Specialized medical VLMs address this challenge via large-scale domain pretraining, but at the cost of substantial annotation and computational resources. To overcome these limitations, we propose \textbf{Knowledge to Sight (K2Sight)}, a framework that introduces structured semantic supervision by decomposing clinical concepts into interpretable visual attributes, such as shape, density, and anatomical location. These attributes are distilled from domain ontologies and encoded into concise instruction-style prompts, which guide region-text alignment during training. Unlike conventional report-level supervision, our approach explicitly bridges domain knowledge and spatial structure, enabling data-efficient training of compact models. We train compact models with 0.23B and 2B parameters using only 1.5\% of the data required by state-of-the-art medical VLMs. Despite their small size and limited training data, these models achieve performance on par with or better than 7B+ medical VLMs, with up to 9.82\% improvement in $mAP_{50}$. Code and models: \href{https://lijunrio.github.io/K2Sight/}{\textcolor{SOTAPink}{https://lijunrio.github.io/K2Sight/}}.

CVMar 6, 2025
Semantic Alignment of Unimodal Medical Text and Vision Representations

Maxime Di Folco, Emily Chan, Marta Hasny et al.

General-purpose AI models, particularly those designed for text and vision, demonstrate impressive versatility across a wide range of deep-learning tasks. However, they often underperform in specialised domains like medical imaging, where domain-specific solutions or alternative knowledge transfer approaches are typically required. Recent studies have noted that general-purpose models can exhibit similar latent spaces when processing semantically related data, although this alignment does not occur naturally. Building on this insight, it has been shown that applying a simple transformation - at most affine - estimated from a subset of semantically corresponding samples, known as anchors, enables model stitching across diverse training paradigms, architectures, and modalities. In this paper, we explore how semantic alignment - estimating transformations between anchors - can bridge general-purpose AI with specialised medical knowledge. Using multiple public chest X-ray datasets, we demonstrate that model stitching across model architectures allows general models to integrate domain-specific knowledge without additional training, leading to improved performance on medical tasks. Furthermore, we introduce a novel zero-shot classification approach for unimodal vision encoders that leverages semantic alignment across modalities. Our results show that our method not only outperforms general multimodal models but also approaches the performance levels of fully trained, medical-specific multimodal solutions

IVJun 12, 2024
Interpretable Representation Learning of Cardiac MRI via Attribute Regularization

Maxime Di Folco, Cosmin I. Bercea, Emily Chan et al.

Interpretability is essential in medical imaging to ensure that clinicians can comprehend and trust artificial intelligence models. Several approaches have been recently considered to encode attributes in the latent space to enhance its interpretability. Notably, attribute regularization aims to encode a set of attributes along the dimensions of a latent representation. However, this approach is based on Variational AutoEncoder and suffers from blurry reconstruction. In this paper, we propose an Attributed-regularized Soft Introspective Variational Autoencoder that combines attribute regularization of the latent space within the framework of an adversarially trained variational autoencoder. We demonstrate on short-axis cardiac Magnetic Resonance images of the UK Biobank the ability of the proposed method to address blurry reconstruction issues of variational autoencoder methods while preserving the latent space interpretability.

IVDec 14, 2023
Attribute Regularized Soft Introspective Variational Autoencoder for Interpretable Cardiac Disease Classification

Maxime Di Folco, Cosmin I. Bercea, Julia A. Schnabel

Interpretability is essential in medical imaging to ensure that clinicians can comprehend and trust artificial intelligence models. In this paper, we propose a novel interpretable approach that combines attribute regularization of the latent space within the framework of an adversarially trained variational autoencoder. Comparative experiments on a cardiac MRI dataset demonstrate the ability of the proposed method to address blurry reconstruction issues of variational autoencoder methods and improve latent space interpretability. Additionally, our analysis of a downstream task reveals that the classification of cardiac disease using the regularized latent space heavily relies on attribute regularized dimensions, demonstrating great interpretability by connecting the used attributes for prediction with clinical observations.

CVMay 31, 2023
Mask, Stitch, and Re-Sample: Enhancing Robustness and Generalizability in Anomaly Detection through Automatic Diffusion Models

Cosmin I. Bercea, Michael Neumayr, Daniel Rueckert et al.

The introduction of diffusion models in anomaly detection has paved the way for more effective and accurate image reconstruction in pathologies. However, the current limitations in controlling noise granularity hinder diffusion models' ability to generalize across diverse anomaly types and compromise the restoration of healthy tissues. To overcome these challenges, we propose AutoDDPM, a novel approach that enhances the robustness of diffusion models. AutoDDPM utilizes diffusion models to generate initial likelihood maps of potential anomalies and seamlessly integrates them with the original image. Through joint noised distribution re-sampling, AutoDDPM achieves harmonization and in-painting effects. Our study demonstrates the efficacy of AutoDDPM in replacing anomalous regions while preserving healthy tissues, considerably surpassing diffusion models' limitations. It also contributes valuable insights and analysis on the limitations of current diffusion models, promoting robust and interpretable anomaly detection in medical imaging - an essential aspect of building autonomous clinical decision systems with higher interpretability.

IVMar 5, 2021
FedDis: Disentangled Federated Learning for Unsupervised Brain Pathology Segmentation

Cosmin I. Bercea, Benedikt Wiestler, Daniel Rueckert et al.

In recent years, data-driven machine learning (ML) methods have revolutionized the computer vision community by providing novel efficient solutions to many unsolved (medical) image analysis problems. However, due to the increasing privacy concerns and data fragmentation on many different sites, existing medical data are not fully utilized, thus limiting the potential of ML. Federated learning (FL) enables multiple parties to collaboratively train a ML model without exchanging local data. However, data heterogeneity (non-IID) among the distributed clients is yet a challenge. To this end, we propose a novel federated method, denoted Federated Disentanglement (FedDis), to disentangle the parameter space into shape and appearance, and only share the shape parameter with the clients. FedDis is based on the assumption that the anatomical structure in brain MRI images is similar across multiple institutions, and sharing the shape knowledge would be beneficial in anomaly detection. In this paper, we leverage healthy brain scans of 623 subjects from multiple sites with real data (OASIS, ADNI) in a privacy-preserving fashion to learn a model of normal anatomy, that allows to segment abnormal structures. We demonstrate a superior performance of FedDis on real pathological databases containing 109 subjects; two publicly available MS Lesions (MSLUB, MSISBI), and an in-house database with MS and Glioblastoma (MSI and GBI). FedDis achieved an average dice performance of 0.38, outperforming the state-of-the-art (SOTA) auto-encoder by 42% and the SOTA federated method by 11%. Further, we illustrate that FedDis learns a shape embedding that is orthogonal to the appearance and consistent under different intensity augmentations.