59.1CVMay 6Code
FairEnc: A Fair Vision-Language Model with Fair Vision and Text Encoders for Glaucoma DetectionMohamed Elhabebe, Ayman El-Baz, Qing Liu
Automated glaucoma detection is critical for preventing irreversible vision loss and reducing the burden on healthcare systems. However, ensuring fairness across diverse patient populations remains a significant challenge. In this paper, we propose FairEnc, a fair pretraining method for vision-language models (VLMs) that enables simultaneous debiasing across multiple sensitive attributes. FairEnc jointly mitigates biases in both textual and visual modalities with respect to multiple sensitive attributes, including race, gender, ethnicity, and language. Specifically, for the textual encoder, we leverage a large language model to generate synthetic clinical descriptions with varied sensitive attributes while preserving disease semantics, and employ a contrastive alignment objective to encourage demographic-invariant representations. For the visual encoder, we propose a dual-level fairness strategy that combines mutual information regularization to reduce statistical dependence between learned features and demographic groups, with multi-discriminator adversarial debiasing. Comprehensive experiments on the publicly available Harvard-FairVLMed dataset demonstrate that FairEnc effectively reduces demographic disparity as measured by DPD and DEOdds while achieving strong diagnostic performance under both zero-shot and linear probing evaluations. Additional experiments on the private FairFundus dataset show that FairEnc consistently preserves fairness advantages under cross-domain and cross-modality settings and maintains diagnostic performance within a competitive range. These results highlight FairEnc's ability to generalize fairness under distribution shifts, supporting its potential for more equitable deployment in real-world clinical settings. Our codebase and synthetic clinical notes are available at https://github.com/Mohamed-Elhabebe/FairEnc
CVNov 1, 2020
A Dilated Residual Hierarchically Fashioned Segmentation Framework for Extracting Gleason Tissues and Grading Prostate Cancer from Whole Slide ImagesTaimur Hassan, Bilal Hassan, Ayman El-Baz et al.
Prostate cancer (PCa) is the second deadliest form of cancer in males, and it can be clinically graded by examining the structural representations of Gleason tissues. This paper proposes \RV{a new method} for segmenting the Gleason tissues \RV{(patch-wise) in order to grade PCa from the whole slide images (WSI).} Also, the proposed approach encompasses two main contributions: 1) A synergy of hybrid dilation factors and hierarchical decomposition of latent space representation for effective Gleason tissues extraction, and 2) A three-tiered loss function which can penalize different semantic segmentation models for accurately extracting the highly correlated patterns. In addition to this, the proposed framework has been extensively evaluated on a large-scale PCa dataset containing 10,516 whole slide scans (with around 71.7M patches), where it outperforms state-of-the-art schemes by 3.22% (in terms of mean intersection-over-union) for extracting the Gleason tissues and 6.91% (in terms of F1 score) for grading the progression of PCa.
LGJul 2, 2016
Alzheimer's Disease Diagnostics by a Deeply Supervised Adaptable 3D Convolutional NetworkEhsan Hosseini-Asl, Georgy Gimel'farb, Ayman El-Baz
Early diagnosis, playing an important role in preventing progress and treating the Alzheimer's disease (AD), is based on classification of features extracted from brain images. The features have to accurately capture main AD-related variations of anatomical brain structures, such as, e.g., ventricles size, hippocampus shape, cortical thickness, and brain volume. This paper proposes to predict the AD with a deep 3D convolutional neural network (3D-CNN), which can learn generic features capturing AD biomarkers and adapt to different domain datasets. The 3D-CNN is built upon a 3D convolutional autoencoder, which is pre-trained to capture anatomical shape variations in structural brain MRI scans. Fully connected upper layers of the 3D-CNN are then fine-tuned for each task-specific AD classification. Experiments on the \emph{ADNI} MRI dataset with no skull-stripping preprocessing have shown our 3D-CNN outperforms several conventional classifiers by accuracy and robustness. Abilities of the 3D-CNN to generalize the features learnt and adapt to other domains have been validated on the \emph{CADDementia} dataset.
LGJul 2, 2016
Alzheimer's Disease Diagnostics by Adaptation of 3D Convolutional NetworkEhsan Hosseini-Asl, Robert Keynto, Ayman El-Baz
Early diagnosis, playing an important role in preventing progress and treating the Alzheimer\{'}s disease (AD), is based on classification of features extracted from brain images. The features have to accurately capture main AD-related variations of anatomical brain structures, such as, e.g., ventricles size, hippocampus shape, cortical thickness, and brain volume. This paper proposed to predict the AD with a deep 3D convolutional neural network (3D-CNN), which can learn generic features capturing AD biomarkers and adapt to different domain datasets. The 3D-CNN is built upon a 3D convolutional autoencoder, which is pre-trained to capture anatomical shape variations in structural brain MRI scans. Fully connected upper layers of the 3D-CNN are then fine-tuned for each task-specific AD classification. Experiments on the CADDementia MRI dataset with no skull-stripping preprocessing have shown our 3D-CNN outperforms several conventional classifiers by accuracy. Abilities of the 3D-CNN to generalize the features learnt and adapt to other domains have been validated on the ADNI dataset.