Aya Elgebaly

CV
h-index: 37
Papers: 3
Citations: 1
Novelty: 50%
AI Score: 46

3 Papers

CV · Nov 11, 2025 · Code
ProSona: Prompt-Guided Personalization for Multi-Expert Medical Image Segmentation

Aya Elgebaly, Nikolaos Delopoulos, Juliane Hörner-Rieber et al.

Automated medical image segmentation suffers from high inter-observer variability, particularly in tasks such as lung nodule delineation, where experts often disagree. Existing approaches either collapse this variability into a consensus mask or rely on separate model branches for each annotator. We introduce ProSona, a two-stage framework that learns a continuous latent space of annotation styles, enabling controllable personalization via natural language prompts. A probabilistic U-Net backbone captures diverse expert hypotheses, while a prompt-guided projection mechanism navigates this latent space to generate personalized segmentations. A multi-level contrastive objective aligns textual and visual representations, promoting disentangled and interpretable expert styles. Across the LIDC-IDRI lung nodule and multi-institutional prostate MRI datasets, ProSona reduces the Generalized Energy Distance by 17% and improves mean Dice by more than one point compared with DPersona. These results demonstrate that natural-language prompts can provide flexible, accurate, and interpretable control over personalized medical image segmentation. Our implementation is available online.
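The abstract reports results in terms of Generalized Energy Distance (GED) and Dice without defining them. The sketch below shows one common way these metrics are computed for multi-annotator segmentation, assuming the widely used form GED² = 2·E[d(S, Y)] − E[d(S, S′)] − E[d(Y, Y′)] with d = 1 − IoU. That choice of distance, the flat 0/1 mask representation, and all function names are assumptions for illustration, not taken from the ProSona implementation.

```python
from itertools import product


def iou_distance(a, b):
    """Return 1 - IoU for two binary masks given as flat sequences of 0/1."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: two empty masks count as identical (distance 0).
    return 0.0 if union == 0 else 1.0 - inter / union


def mean_pairwise_distance(xs, ys):
    """Average IoU distance over all pairs drawn from xs and ys."""
    pairs = list(product(xs, ys))
    return sum(iou_distance(a, b) for a, b in pairs) / len(pairs)


def generalized_energy_distance(samples, annotations):
    """Squared GED between model samples and expert annotations."""
    return (
        2.0 * mean_pairwise_distance(samples, annotations)
        - mean_pairwise_distance(samples, samples)
        - mean_pairwise_distance(annotations, annotations)
    )


def dice(a, b):
    """Dice overlap for two binary masks given as flat sequences of 0/1."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * inter / total
```

Under this definition, a lower GED means the set of predicted masks matches the set of expert masks more closely in both accuracy and diversity, while Dice scores a single prediction against a single reference.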

CV · Jul 24, 2024 · Code
CoMoTo: Unpaired Cross-Modal Lesion Distillation Improves Breast Lesion Detection in Tomosynthesis

Muhammad Alberb, Marawan Elbatel, Aya Elgebaly et al.

Digital Breast Tomosynthesis (DBT) is an advanced breast imaging modality that offers superior lesion detection accuracy compared to conventional mammography, albeit at the cost of longer reading time. Accelerating lesion detection from DBT using deep learning is hindered by limited data availability and high annotation costs. One possible solution is to leverage the information provided by a more widely available modality, such as mammography, to enhance DBT lesion detection. In this paper, we present a novel framework, CoMoTo, for improving lesion detection in DBT. Our framework leverages unpaired mammography data to enhance the training of a DBT model, improving practicality by eliminating the need for mammography during inference. Specifically, we propose two novel components, Lesion-specific Knowledge Distillation (LsKD) and Intra-modal Point Alignment (ImPA). LsKD selectively distills lesion features from a mammography teacher model to a DBT student model, disregarding background features. ImPA further enriches LsKD by ensuring the alignment of lesion features within the teacher before distilling knowledge to the student. Our comprehensive evaluation shows that CoMoTo is superior to traditional pretraining and image-level KD, improving Mean Sensitivity by 7% in the low-data setting. Our code is available at https://github.com/Muhammad-Al-Barbary/CoMoTo.
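The idea of distilling only lesion features while ignoring background can be illustrated with a toy loss. The sketch below is not the CoMoTo implementation: it assumes features are given as one vector per spatial position, that a binary mask marks lesion positions, and that teacher and student are compared through their mask-averaged lesion vectors (a simplification chosen here because the two modalities are unpaired). All function names are hypothetical.

```python
def masked_mean(features, mask):
    """Average the feature vectors at positions where mask is 1."""
    selected = [f for f, m in zip(features, mask) if m]
    if not selected:
        raise ValueError("mask selects no lesion positions")
    dim = len(selected[0])
    return [sum(f[i] for f in selected) / len(selected) for i in range(dim)]


def lesion_distillation_loss(student_feats, student_mask,
                             teacher_feats, teacher_mask):
    """Mean squared error between pooled lesion features.

    Background positions (mask == 0) never enter the loss, so the
    student is only pulled toward the teacher's lesion representation.
    """
    s = masked_mean(student_feats, student_mask)
    t = masked_mean(teacher_feats, teacher_mask)
    return sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)
```

Because the loss only sees pooled lesion vectors, the teacher and student inputs do not need to show the same patient or have the same spatial size, which is the property an unpaired setting requires.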

26.7 · LG · May 1
A Framework for Exploring and Disentangling Intersectional Bias: A Case Study in Fetal Ultrasound

Aya Elgebaly, Joris Fournel, Benjamin Laine Jønch Jurgensen et al.

Bias in medical AI is often framed as a problem of representation. However, in image-based tasks such as fetal ultrasound, performance disparities can arise even when representation is adequate, because predictive accuracy depends strongly on image quality. Image quality is shaped by acquisition conditions and operator expertise, as well as patient-dependent factors such as maternal body mass index (BMI), all of which may correlate with sensitive demographic features. Consequently, observed disparities may reflect the combined influence of demographic, clinical, and acquisition-related factors rather than data imbalance alone, and may obscure underlying interaction or confounding effects. We propose a structured framework to explore and detect intersectional bias, combining unsupervised slice discovery, systematic factor-wise analysis, and targeted intersectional evaluation. In a case study of over 94,000 ultrasound images for fetal weight estimation, we analyze bias in a state-of-the-art deep learning (DL) model and the clinical standard Hadlock, a regression formula using biometric measurements. Pixel spacing (PS), a parameter considered suboptimal in current acquisition protocols, emerged as a consistent driver of performance differences, with higher PS associated with improvements of up to 24% in selected subgroups for both models. Because PS is often adapted in cases of high BMI or low gestational age (GA), this effect carries a substantial risk of confounding. Our intersectional analysis revealed that part of the PS-associated signal is explained by GA, while PS-related improvements persist across BMI strata, highlighting the importance of acquisition-aware and interaction-aware evaluation in medical AI fairness research.
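The confounding check described above, asking whether a PS effect survives once GA or BMI is held fixed, can be sketched as a stratified error table. This is a minimal illustration rather than the paper's framework: the record fields (`ps_bin`, `ga_bin`, `error`) and both function names are hypothetical, and mean absolute error is assumed as the performance measure.

```python
from collections import defaultdict


def mean_abs_error_by(records, factor):
    """Mean absolute error for each level of a single factor."""
    groups = defaultdict(list)
    for r in records:
        groups[r[factor]].append(abs(r["error"]))
    return {level: sum(v) / len(v) for level, v in groups.items()}


def stratified_mean_abs_error(records, factor, stratum):
    """Mean absolute error for each (stratum, factor) cell.

    Comparing factor levels inside one stratum shows whether the
    factor's apparent effect persists when the stratum is held fixed.
    """
    cells = defaultdict(list)
    for r in records:
        cells[(r[stratum], r[factor])].append(abs(r["error"]))
    return {cell: sum(v) / len(v) for cell, v in cells.items()}
```

If the gap between PS levels shrinks or vanishes inside each GA stratum, GA explains part of the marginal PS signal; if the gap stays similar across strata, as the abstract reports for BMI, the stratifying variable is unlikely to account for it.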