Nadieh Khalili

h-index: 9

Papers: 9

Citations: 368

Novelty: 43%

AI Score: 42

Ranked #59,194 of 194,257 authors (top 30%); #516 in IV (top 12%)

9 Papers

7.1 | CV | May 5

DALPHIN: Benchmarking Digital Pathology AI Copilots Against Pathologists on an Open Multicentric Dataset

Carlijn Lems, Sander Moonemans, Natálie Klubíčková et al.

Foundation models with visual question answering capabilities for digital pathology are emerging. Such unprecedented technology requires independent benchmarking to assess its potential in assisting pathologists in routine diagnostics. We created DALPHIN, the first multicentric open benchmark for pathology AI copilots, comprising 1236 images from 300 cases, spanning 130 rare to common diagnoses, 6 countries, and 14 subspecialties. The DALPHIN design and dataset are introduced alongside a human performance benchmark of 31 pathologists from 10 countries with varying expertise. We report results for two general-purpose copilots (GPT-5, Gemini 2.5 Pro) and one pathology-specific copilot (PathChat+) for sequential and independent answer generation. We observed no statistically significant difference from expert-level performance in four of six tasks for PathChat+, two of six tasks for Gemini 2.5 Pro, and one of six tasks for GPT-5. DALPHIN is publicly released with sequestered, indirectly accessible ground truth to foster robust and enduring benchmarking. Data, methods, and the evaluation platform are accessible through dalphin.grand-challenge.org.

6.5 | CV | Feb 16, 2024 | Code

Uncertainty-guided annotation enhances segmentation with the human-in-the-loop

Nadieh Khalili, Joey Spronck, Francesco Ciompi et al.

Deep learning algorithms, often critiqued for their 'black box' nature, traditionally fall short in providing the necessary transparency for trusted clinical use. This challenge is particularly evident when such models are deployed in local hospitals, encountering out-of-domain distributions due to varying imaging techniques and patient-specific pathologies. Yet, this limitation offers a unique avenue for continual learning. The Uncertainty-Guided Annotation (UGA) framework introduces a human-in-the-loop approach, enabling AI to convey its uncertainties to clinicians, effectively acting as an automated quality control mechanism. UGA eases this interaction by quantifying uncertainty at the pixel level, thereby revealing the model's limitations and opening the door for clinician-guided corrections. We evaluated UGA on the Camelyon dataset for lymph node metastasis segmentation, which revealed that UGA improved the Dice coefficient (DC) from 0.66 to 0.76 by adding 5 patches, and further to 0.84 with 10 patches. To foster broader application and community contribution, we have made our code accessible at

13.3 | AI | Nov 6, 2024

Navigating the landscape of multimodal AI in medicine: a scoping review on technical challenges and clinical applications

Daan Schouten, Giulia Nicoletti, Bas Dille et al.

Recent technological advances in healthcare have led to unprecedented growth in patient data quantity and diversity. While artificial intelligence (AI) models have shown promising results in analyzing individual data modalities, there is increasing recognition that models integrating multiple complementary data sources, so-called multimodal AI, could enhance clinical decision-making. This scoping review examines the landscape of deep learning-based multimodal AI applications across the medical domain, analyzing 432 papers published between 2018 and 2024. We provide an extensive overview of multimodal AI development across different medical disciplines, examining various architectural approaches, fusion strategies, and common application areas. Our analysis reveals that multimodal AI models consistently outperform their unimodal counterparts, with an average improvement of 6.2 percentage points in AUC. However, several challenges persist, including cross-departmental coordination, heterogeneous data characteristics, and incomplete datasets. We critically assess the technical and practical challenges in developing multimodal AI systems and discuss potential strategies for their clinical implementation, including a brief overview of commercially available multimodal AI models for clinical decision-making. Additionally, we identify key factors driving multimodal AI development and propose recommendations to accelerate the field's maturation. This review provides researchers and clinicians with a thorough understanding of the current state, challenges, and future directions of multimodal AI in medicine.

21.3 | IV | Jul 26, 2021 | Code

Structure-Preserving Multi-Domain Stain Color Augmentation using Style-Transfer with Disentangled Representations

Sophia J. Wagner, Nadieh Khalili, Raghav Sharma et al.

In digital pathology, different staining procedures and scanners cause substantial color variations in whole-slide images (WSIs), especially across different laboratories. These color shifts result in poor generalization of deep learning-based methods from the training domain to external pathology data. To increase test performance, stain normalization techniques are used to reduce the variance between training and test domains. Alternatively, color augmentation can be applied during training, leading to a more robust model without the extra step of color normalization at test time. We propose a novel color augmentation technique, HistAuGAN, that can simulate a wide variety of realistic histology stain colors, thus making neural networks stain-invariant when applied during training. Based on a generative adversarial network (GAN) for image-to-image translation, our model disentangles the content of the image, i.e., the morphological tissue structure, from the stain color attributes. It can be trained on multiple domains and, therefore, learns to cover different stain colors as well as other domain-specific variations introduced in the slide preparation and imaging process. We demonstrate that HistAuGAN outperforms conventional color augmentation techniques on a classification task on the publicly available dataset Camelyon17 and show that it is able to mitigate present batch effects.

3.6 | IV | Nov 10, 2019

Combined analysis of coronary arteries and the left ventricular myocardium in cardiac CT angiography for detection of patients with functionally significant stenosis

Majd Zreik, Tim Leiner, Nadieh Khalili et al.

Treatment of patients with obstructive coronary artery disease is guided by the functional significance of a coronary artery stenosis. Fractional flow reserve (FFR), measured during invasive coronary angiography (ICA), is considered the gold standard to define the functional significance of a coronary stenosis. Here, we present a method for non-invasive detection of patients with functionally significant coronary artery stenosis, combining analysis of the coronary artery tree and the left ventricular (LV) myocardium in cardiac CT angiography (CCTA) images. We retrospectively collected CCTA scans of 126 patients who underwent invasive FFR measurements, to determine the functional significance of coronary stenoses. We combine our previous works for the analysis of the complete coronary artery tree and the LV myocardium: Coronary arteries are encoded by two disjoint convolutional autoencoders (CAEs) and the LV myocardium is characterized by a convolutional neural network (CNN) and a CAE. Thereafter, using the extracted encodings of all coronary arteries and the LV myocardium, patients are classified according to the presence of functionally significant stenosis, as defined by the invasively measured FFR. To handle the varying number of coronary arteries in a patient, the classification is formulated as a multiple instance learning problem and is performed using an attention-based neural network. Cross-validation experiments resulted in an average area under the receiver operating characteristic curve of $0.74 \pm 0.01$, and showed that the proposed combined analysis outperformed the analysis of the coronary arteries or the LV myocardium only. The results demonstrate the feasibility of combining the analyses of the complete coronary artery tree and the LV myocardium in CCTA images for the detection of patients with functionally significant stenosis in coronary arteries.
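The attention-based multiple instance learning step mentioned in this abstract can be illustrated with a small sketch. It is not the authors' network: the function name `attention_pool`, the weight shapes, and the tanh-based scoring are assumptions borrowed from a common attention-pooling formulation. The point it shows is how a bag with any number of artery encodings is reduced to one fixed-size vector.

```python
import math


def attention_pool(instances, V, w):
    """Pool a variable-length bag of encodings into one fixed-size vector.

    instances: list of K encodings, each a list of D floats
    V: H-by-D weight matrix (list of H rows); w: list of H floats
    Both V and w are hypothetical parameters that would normally be learned.
    """
    # Unnormalised attention score per instance: w . tanh(V h)
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wj * uj for wj, uj in zip(w, hidden)))

    # Softmax over the instances (max subtracted for numerical stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted sum of the encodings
    dim = len(instances[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return pooled, weights
```

A bag of two encodings and a bag of five both yield a vector of length D, so a single downstream classifier can handle patients with different numbers of arteries. The returned weights also indicate which instances contributed most.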

18.0 | IV | Jun 11, 2019

Automatic brain tissue segmentation in fetal MRI using convolutional neural networks

N. Khalili, N. Lessmann, E. Turk et al.

MR images of fetuses allow clinicians to detect brain abnormalities in an early stage of development. The cornerstone of volumetric and morphologic analysis in fetal MRI is segmentation of the fetal brain into different tissue classes. Manual segmentation is cumbersome and time consuming, hence automatic segmentation could substantially simplify the procedure. However, automatic brain tissue segmentation in these scans is challenging owing to artifacts including intensity inhomogeneity, caused in particular by spontaneous fetal movements during the scan. Unlike methods that estimate the bias field to remove intensity inhomogeneity as a preprocessing step to segmentation, we propose to perform segmentation using a convolutional neural network that exploits images with synthetically introduced intensity inhomogeneity as data augmentation. The method first uses a CNN to extract the intracranial volume. Thereafter, another CNN with the same architecture is employed to segment the extracted volume into seven brain tissue classes: cerebellum, basal ganglia and thalami, ventricular cerebrospinal fluid, white matter, brain stem, cortical gray matter and extracerebral cerebrospinal fluid. To make the method applicable to slices showing intensity inhomogeneity artifacts, the training data was augmented by applying a combination of linear gradients with random offsets and orientations to image slices without artifacts.
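The augmentation described here, linear gradients with random offsets and orientations applied to artifact-free slices, might look roughly like the sketch below. The abstract gives no parameters, so the multiplicative form, the number of gradients, the strength range, and the name `add_intensity_inhomogeneity` are all assumptions made for illustration.

```python
import math
import random


def add_intensity_inhomogeneity(image, n_gradients=2, max_strength=0.5, rng=None):
    """Multiply a 2D slice by a sum of randomly oriented linear ramps.

    image: list of rows, each a list of non-negative floats.
    n_gradients and max_strength are assumed values, not from the paper.
    """
    rng = rng or random.Random()
    rows, cols = len(image), len(image[0])
    # A field of ones leaves the image unchanged
    field = [[1.0] * cols for _ in range(rows)]

    for _ in range(n_gradients):
        angle = rng.uniform(0.0, 2.0 * math.pi)   # random orientation
        offset = rng.uniform(-0.5, 0.5)           # random offset
        strength = rng.uniform(0.0, max_strength)
        dy, dx = math.sin(angle), math.cos(angle)
        for r in range(rows):
            for c in range(cols):
                # Pixel coordinates normalised to [-0.5, 0.5]
                y = r / max(rows - 1, 1) - 0.5
                x = c / max(cols - 1, 1) - 0.5
                field[r][c] += strength * (x * dx + y * dy + offset)

    # Clamp the field at zero so intensities never become negative
    return [[image[r][c] * max(field[r][c], 0.0) for c in range(cols)]
            for r in range(rows)]
```

Passing a seeded `random.Random` makes the augmentation reproducible. In training, a fresh field would typically be drawn for each slice.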

6.3 | IV | Jun 11, 2019

Generative adversarial network for segmentation of motion affected neonatal brain MRI

N. Khalili, E. Turk, M. Zreik et al.

Automatic neonatal brain tissue segmentation in preterm born infants is a prerequisite for evaluation of brain development. However, automatic segmentation is often hampered by motion artifacts caused by infant head movements during image acquisition. Methods have been developed to remove or minimize these artifacts during image reconstruction using frequency domain data. However, frequency domain data might not always be available. Hence, in this study we propose a method for removing motion artifacts from the already reconstructed MR scans. The method employs a generative adversarial network trained with a cycle consistency loss to transform slices affected by motion into slices without motion artifacts, and vice versa. In the experiments, 40 T2-weighted coronal MR scans of preterm born infants imaged at 30 weeks postmenstrual age were used. All images contained slices affected by motion artifacts hampering automatic tissue segmentation. To evaluate whether correction allows more accurate image segmentation, the images were segmented into 8 tissue classes: cerebellum, myelinated white matter, basal ganglia and thalami, ventricular cerebrospinal fluid, white matter, brain stem, cortical gray matter, and extracerebral cerebrospinal fluid. Images corrected for motion and corresponding segmentations were qualitatively evaluated using a 5-point Likert scale. Before the correction of motion artifacts, median image quality and quality of corresponding automatic segmentations were assigned grade 2 (poor) and 3 (moderate), respectively. After correction of motion artifacts, they improved to grades 3 and 4, respectively. The results indicate that correction of motion artifacts in the image space using the proposed approach allows accurate segmentation of brain tissue classes in slices affected by motion artifacts.

6.3 | IV | Jun 11, 2019

Deep learning analysis of coronary arteries in cardiac CT angiography for detection of patients requiring invasive coronary angiography

Majd Zreik, Robbert W. van Hamersvelt, Nadieh Khalili et al.

In patients with obstructive coronary artery disease, the functional significance of a coronary artery stenosis needs to be determined to guide treatment. This is typically established through fractional flow reserve (FFR) measurement, performed during invasive coronary angiography (ICA). We present a method for automatic and non-invasive detection of patients requiring ICA, employing deep unsupervised analysis of complete coronary arteries in cardiac CT angiography (CCTA) images. We retrospectively collected CCTA scans of 187 patients, 137 of whom underwent invasive FFR measurement in 192 different coronary arteries. These FFR measurements served as a reference standard for the functional significance of the coronary stenosis. The centerlines of the coronary arteries were extracted and used to reconstruct straightened multi-planar reformatted (MPR) volumes. To automatically identify arteries with functionally significant stenosis that require ICA, each MPR volume was encoded into a fixed number of encodings using two disjoint 3D and 1D convolutional autoencoders performing spatial and sequential encodings, respectively. Thereafter, these encodings were employed to classify arteries using a support vector machine classifier. The detection of coronary arteries requiring invasive evaluation, evaluated using repeated cross-validation experiments, resulted in an area under the receiver operating characteristic curve of $0.81 \pm 0.02$ on the artery level, and $0.87 \pm 0.02$ on the patient level. The results demonstrate the feasibility of automatic non-invasive detection of patients who require ICA and possibly subsequent coronary artery intervention. This could potentially reduce the number of patients who unnecessarily undergo ICA.
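The area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it that way from pairwise comparisons. The function name `roc_auc` and the example labels and scores are made up for illustration and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: list of 0/1 reference labels; scores: list of classifier outputs.
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative values only: three of the four pairs are ranked correctly
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise form is quadratic in the number of cases, which is fine for cohorts of a few hundred patients. A rank-based implementation would be preferable for larger sets.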

3.8 | CV | Jul 31, 2017

Automatic segmentation of the intracranial volume in fetal MR images

N. Khalili, P. Moeskops, N. H. P. Claessens et al.

MR images of the fetus allow non-invasive analysis of the fetal brain. Quantitative analysis of fetal brain development requires automatic brain tissue segmentation that is typically preceded by segmentation of the intracranial volume (ICV). This is challenging because fetal MR images visualize the whole moving fetus and in addition partially visualize the maternal body. This paper presents an automatic method for segmentation of the ICV in fetal MR images. The method employs a multi-scale convolutional neural network in 2D slices to enable learning spatial information from larger context as well as detailed local information. The method is developed and evaluated with 30 fetal T2-weighted MRI scans (average age $33.2\pm1.2$ weeks postmenstrual age). The set contains $10$ scans acquired in axial, $10$ in coronal and $10$ in sagittal imaging planes. A reference standard was defined in all images by manual annotation of the intracranial volume in $10$ equidistantly distributed slices. The automatic analysis was performed by training and testing the network using scans acquired in the representative imaging plane as well as combining the training data from all imaging planes. On average, the automatic method achieved Dice coefficients of 0.90 for the axial images, 0.90 for the coronal images and 0.92 for the sagittal images. Combining the training sets resulted in average Dice coefficients of 0.91 for the axial images, 0.95 for the coronal images, and 0.92 for the sagittal images. The results demonstrate that the evaluated method achieved good performance in extracting ICV in fetal MR scans regardless of the imaging plane.
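For readers unfamiliar with the Dice coefficient used in this and other abstracts on this page, a minimal sketch follows. It compares two binary masks as twice their overlap divided by their combined size. The function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are assumptions, and the tiny masks are invented for illustration.

```python
def dice_coefficient(pred, ref):
    """Dice overlap between two binary 2D masks given as lists of rows."""
    flat_pred = [bool(v) for row in pred for v in row]
    flat_ref = [bool(v) for row in ref for v in row]
    if len(flat_pred) != len(flat_ref):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for p, r in zip(flat_pred, flat_ref) if p and r)
    total = sum(flat_pred) + sum(flat_ref)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention)
        return 1.0
    return 2.0 * intersection / total


# One shared pixel, three foreground pixels in total: 2 * 1 / 3
print(round(dice_coefficient([[1, 1], [0, 0]], [[1, 0], [0, 0]]), 3))  # 0.667
```

A value of 1.0 means the automatic and manual masks coincide exactly, while 0.0 means they share no foreground pixel.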