CR Aug 30, 2024
Hybridizing Base-Line 2D-CNN Model with Cat Swarm Optimization for Enhanced Advanced Persistent Threat Detection
Ali M. Bakhiet, Salah A. Aly
In the realm of cyber-security, detecting Advanced Persistent Threats (APTs) remains a formidable challenge due to their stealthy and sophisticated nature. This research paper presents an innovative approach that leverages Convolutional Neural Networks (CNNs) with a 2D baseline model, enhanced by the cutting-edge Cat Swarm Optimization (CSO) algorithm, to significantly improve APT detection accuracy. By seamlessly integrating the 2D-CNN baseline model with CSO, we unlock the potential for unprecedented accuracy and efficiency in APT detection. The results unveil an impressive accuracy score of 98.4%, marking a significant enhancement in APT detection across various attack stages, illuminating a path forward in combating these relentless and sophisticated threats.
CV Apr 17
Early Detection of Acute Myeloid Leukemia (AML) Using YOLOv12 Deep Learning Model
Enas E. Ahmed, Salah A. Aly, Mayar Moner
Acute Myeloid Leukemia (AML) is one of the most life-threatening types of blood cancer, and its accurate classification remains a challenging task due to the visual similarity between various cell types. This study addresses the multiclass classification of AML cells using the YOLOv12 deep learning model. We applied two segmentation approaches based on cell and nucleus features, using Hue channel and Otsu thresholding techniques to preprocess the images prior to classification. Our experiments demonstrate that YOLOv12 with Otsu thresholding on cell-based segmentation achieved the highest validation and test accuracy, both reaching 99.3%.
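As an illustration of the Otsu thresholding step named in this abstract, here is a minimal pure-Python sketch. It is not the authors' preprocessing pipeline: the function name `otsu_threshold` and the input format (a flat list of 8-bit grayscale intensities) are assumptions made for this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with intensity <= t form one class, the rest form the other.
    `pixels` is assumed to be a flat list of integers in [0, levels).
    """
    # Build the intensity histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of intensities in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break     # upper class empty, no further split possible
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical bimodal image: dark background and bright cell pixels.
image = [10] * 50 + [200] * 50
t = otsu_threshold(image)
mask = [p > t for p in image]  # True marks foreground pixels
```

The threshold is chosen from the image histogram alone, so no manual tuning per image is needed; in practice a library routine would typically replace this loop.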
CV Oct 14, 2024
Early Diagnosis of Acute Lymphoblastic Leukemia Using YOLOv8 and YOLOv11 Deep Learning Models
Alaa Awad, Salah A. Aly
Leukemia, a severe form of blood cancer, claims thousands of lives each year. This study focuses on the detection of Acute Lymphoblastic Leukemia (ALL) using advanced image processing and deep learning techniques. By leveraging recent advancements in artificial intelligence, the research evaluates the reliability of these methods in practical, real-world scenarios. Specifically, it examines the performance of state-of-the-art YOLO models, including YOLOv8 and YOLOv11, to distinguish between malignant and benign white blood cells and accurately identify different stages of ALL, including early stages. Moreover, the models demonstrate the ability to detect hematogones, which are frequently misclassified as ALL. With accuracy rates reaching 98.8%, this study highlights the potential of these algorithms to provide robust and precise leukemia detection across diverse datasets and conditions.
IV Mar 31, 2025
Detecting Glioma, Meningioma, and Pituitary Tumors, and Normal Brain Tissues based on Yolov11 and Yolov8 Deep Learning Models
Ahmed M. Taha, Salah A. Aly, Mohamed F. Darwish
Accurate and quick diagnosis of normal brain tissue, glioma, meningioma, and pituitary tumors is crucial for optimal treatment planning and improved medical outcomes. Magnetic Resonance Imaging (MRI) is widely used as a non-invasive diagnostic tool for detecting brain abnormalities, including tumors. However, manual interpretation of MRI scans is often time-consuming, prone to human error, and dependent on highly specialized expertise. This paper proposes an advanced AI-driven technique for detecting glioma, meningioma, and pituitary brain tumors using YoloV11 and YoloV8 deep learning models. Methods: Using a transfer learning-based fine-tuning approach, we integrate cutting-edge deep learning techniques with medical imaging to classify brain tumors into four categories: No-Tumor, Glioma, Meningioma, and Pituitary Tumors. Results: The study utilizes the publicly accessible CE-MRI Figshare dataset and fine-tunes the pre-trained YoloV8 and YoloV11 models, which reach accuracies of 99.49% and 99.56%, respectively; a customized CNN reaches an accuracy of 96.98%. The results validate the potential of CNNs in achieving high precision in brain tumor detection and classification, highlighting their transformative role in medical imaging and diagnostics.
CV Jan 14, 2025
Revolutionizing Communication with Deep Learning and XAI for Enhanced Arabic Sign Language Recognition
Mazen Balat, Rewaa Awaad, Ahmed B. Zaky et al.
This study introduces an integrated approach to recognizing Arabic Sign Language (ArSL) using state-of-the-art deep learning models such as MobileNetV3, ResNet50, and EfficientNet-B2. These models are further enhanced by explainable AI (XAI) techniques to boost interpretability. The ArSL2018 and RGB Arabic Alphabets Sign Language (AASL) datasets are employed, with EfficientNet-B2 achieving peak accuracies of 99.48% and 98.99%, respectively. Key innovations include sophisticated data augmentation methods to mitigate class imbalance, implementation of stratified 5-fold cross-validation for better generalization, and the use of Grad-CAM for clear model decision transparency. The proposed system not only sets new benchmarks in recognition accuracy but also emphasizes interpretability, making it suitable for applications in healthcare, education, and inclusive communication technologies.
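To make the stratified 5-fold cross-validation mentioned in this abstract concrete, the following is a minimal pure-Python sketch of one way to build stratified folds. It is an illustration only, not the authors' implementation: the function name `stratified_folds`, the round-robin assignment, and the toy labels are assumptions.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class out round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Hypothetical imbalanced label set: 10 samples of "alif", 5 of "ba".
labels = ["alif"] * 10 + ["ba"] * 5
folds = stratified_folds(labels, k=5)

# Each fold serves once as the validation set; the rest form the training set.
for val_fold in folds:
    train_idx = [i for fold in folds if fold is not val_fold for i in fold]
```

Because each class is distributed separately, a minority class appears in every validation fold instead of being concentrated in one, which is the property that matters when classes are imbalanced.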
IV Feb 13, 2025
Acute Lymphoblastic Leukemia Diagnosis Employing YOLOv11, YOLOv8, ResNet50, and Inception-ResNet-v2 Deep Learning Models
Alaa Awad, Salah A. Aly
Thousands of individuals succumb annually to leukemia alone. As artificial intelligence-driven technologies continue to evolve and advance, the question of their applicability and reliability remains unresolved. This study aims to utilize image processing and deep learning methodologies to achieve state-of-the-art results for the detection of Acute Lymphoblastic Leukemia (ALL) using data that best represents real-world scenarios. ALL is one of several types of blood cancer, and it is an aggressive form of leukemia. In this investigation, we examine the most recent advancements in ALL detection, as well as the latest iteration of the YOLO series and its performance. We address the question of whether white blood cells are malignant or benign. Additionally, the proposed models can identify different ALL stages, including early stages. Furthermore, these models can detect hematogones despite their frequent misclassification as ALL. By utilizing advanced deep learning models, namely, YOLOv8, YOLOv11, ResNet50 and Inception-ResNet-v2, the study achieves accuracy rates as high as 99.7%, demonstrating the effectiveness of these algorithms across multiple datasets and various real-world situations.
CV Oct 18, 2024
Explainable AI in Handwriting Detection for Dyslexia Using Transfer Learning
Mahmoud Robaa, Mazen Balat, Rewaa Awaad et al.
This study introduces an explainable AI (XAI) framework for the detection of dyslexia through handwriting analysis, achieving an impressive test precision of 99.65%. The framework integrates transfer learning and transformer-based models, identifying handwriting features associated with dyslexia while ensuring transparency in decision-making via Grad-CAM visualizations. Its adaptability to different languages and writing systems underscores its potential for global applicability. By surpassing the classification accuracy of state-of-the-art methods, this approach demonstrates the reliability of handwriting analysis as a diagnostic tool. The findings emphasize the framework's ability to support early detection, build stakeholder trust, and enable personalized educational strategies.
SD Nov 11, 2021
Towards an Efficient Voice Identification Using Wav2Vec2.0 and HuBERT Based on the Quran Reciters Dataset
Aly Moustafa, Salah A. Aly
Current authentication and trusted systems depend on classical and biometric methods to recognize or authorize users. Such methods include audio speech recognition and eye and finger signatures. Recent tools utilize deep learning and transformers to achieve better results. In this paper, we develop a deep learning model for Arabic speaker identification using the Wav2Vec2.0 and HuBERT audio representation learning tools. The end-to-end Wav2Vec2.0 paradigm learns contextualized speech representations by randomly masking a set of feature vectors and then applying a transformer neural network. We employ an MLP classifier that is able to differentiate between invariant labeled classes. We present several experimental results that confirm the high accuracy of the proposed model. The experiments show that an arbitrary wave signal for a given speaker can be identified with 98% and 97.1% accuracy using Wav2Vec2.0 and HuBERT, respectively.
CL Nov 1, 2021
ASMDD: Arabic Speech Mispronunciation Detection Dataset
Salah A. Aly, Abdelrahman Salah, Hesham M. Eraqi
The largest dataset for Arabic speech mispronunciation detection in Egyptian dialogues is introduced. The dataset is composed of annotated audio files representing the 100 most frequently used words in the Arabic language, pronounced by 100 Egyptian children (aged between 2 and 8 years old). The dataset was collected and annotated for segmental pronunciation errors by expert listeners.
CV Oct 9, 2021
Arabic Speech Emotion Recognition Employing Wav2vec2.0 and HuBERT Based on BAVED Dataset
Omar Mohamed, Salah A. Aly
Recently, there have been tremendous research outcomes in the fields of speech recognition and natural language processing. This is due to well-developed multi-layer deep learning paradigms such as wav2vec2.0, Wav2vecU, WavBERT, and HuBERT, which provide better representation learning and capture more information. Such paradigms run on hundreds of unlabeled data and are then fine-tuned on a small dataset for specific tasks. This paper introduces a deep learning emotion recognition model for Arabic speech dialogues. The developed model employs state-of-the-art audio representations, including wav2vec2.0 and HuBERT. The experimental results show that our model outperforms previously reported results.
CV Sep 15, 2012
A Hajj And Umrah Location Classification System For Video Crowded Scenes
Hossam M. Zawbaa, Salah A. Aly, Adnan A. Gutub
In this paper, a new automatic system for classifying ritual locations in diverse Hajj and Umrah video scenes is investigated. This challenging subject has mostly been ignored in the past due to several problems, one of which is the lack of realistic annotated video datasets. The HUER dataset is defined to model six different Hajj and Umrah ritual locations [26]. The proposed Hajj and Umrah ritual location classification system consists of four main phases: preprocessing, segmentation, feature extraction, and location classification. Shot boundary detection and background/foreground segmentation algorithms are applied to prepare the input video scenes for the KNN, ANN, and SVM classifiers. The system improves on the state-of-the-art results for Hajj and Umrah location classification and successfully recognizes the six Hajj rituals with more than 90% accuracy. The experiments demonstrate promising results.
CV Aug 27, 2012
A Missing and Found Recognition System for Hajj and Umrah
Salah A. Aly
This note describes an integrated recognition system for identifying missing and found objects as well as missing, dead, and found people during Hajj and Umrah seasons in the two Holy cities of Makkah and Madina in the Kingdom of Saudi Arabia. It is assumed that the total estimated number of pilgrims will reach 20 million during the next decade. The ultimate goal of this system is to integrate facial recognition and object identification solutions into the Hajj and Umrah rituals. The missing and found computerized system is part of the CrowdSensing system for Hajj and Umrah crowd estimation, management and safety.
CV May 20, 2012
Pilgrims Face Recognition Dataset -- HUFRD
Salah A. Aly
In this work, we define a new pilgrims face recognition dataset, called HUFRD. The newly developed dataset presents various pilgrims' images taken from outside the Holy Masjid El-Harram in Makkah during the 2011-2012 Hajj and Umrah seasons. This dataset will be used to test our developed facial recognition and detection algorithms, as well as to assist in the missing and found recognition system \cite{crowdsensing}.
CV May 10, 2012
Hajj and Umrah Event Recognition Datasets
Hossam Zawbaa, Salah A. Aly
In this note, new Hajj and Umrah Event Recognition datasets (HUER) are presented. The datasets are based on videos and images taken during the 2011-2012 Hajj and Umrah seasons. HUER is the first collection of datasets covering the six types of Hajj and Umrah ritual events (rotating in Tawaf around Kabaa, performing Sa'y between Safa and Marwa, standing on the mount of Arafat, staying overnight in Muzdalifah, staying two or three days in Mina, and throwing Jamarat). The HUER datasets also contain video and image databases for nine types of human actions during Hajj and Umrah (walking, drinking from Zamzam water, sleeping, smiling, eating, praying, sitting, shaving hair and ablutions, reading the holy Quran and making duaa). The spatial resolutions are 1280 x 720 pixels for images and 640 x 480 pixels for videos; the videos average 20 seconds in length at 30 frames per second.
CV Feb 11, 2012
Efficient Web-based Facial Recognition System Employing 2DHOG
Moataz M. Abdelwahab, Salah A. Aly, Islam Yousry
In this paper, a system for facial recognition to identify missing and found people in Hajj and Umrah is described as a web portal. Explicitly, we present a novel algorithm for recognition and classification of facial images based on applying 2DPCA to a 2D representation of the Histogram of Oriented Gradients (2D-HOG), which maintains the spatial relation between pixels of the input images. This algorithm allows a compact representation of the images, which reduces the computational complexity and the storage requirements while maintaining the highest reported recognition accuracy. This makes the method suitable for use with very large datasets. A large dataset was collected for people in Hajj. Experimental results employing the ORL, UMIST, JAFFE, and HAJJ datasets confirm these excellent properties.