Mohamed A. Mabrok

h-index: 15

9 papers

32 citations

Novelty: 41%

AI Score: 48

Ranked #27,211 of 194,257 authors (top 14%); #183 in IV (top 4%)

9 Papers

5.7 | LG | Mar 17

Latent Semantic Manifolds in Large Language Models

Mohamed A. Mabrok

Large Language Models (LLMs) perform internal computations in continuous vector spaces yet produce discrete tokens -- a fundamental mismatch whose geometric consequences remain poorly understood. We develop a mathematical framework that interprets LLM hidden states as points on a latent semantic manifold: a Riemannian submanifold equipped with the Fisher information metric, where tokens correspond to Voronoi regions partitioning the manifold. We define the expressibility gap, a geometric measure of the semantic distortion from vocabulary discretization, and prove two theorems: a rate-distortion lower bound on distortion for any finite vocabulary, and a linear volume scaling law for the expressibility gap via the coarea formula. We validate these predictions across six transformer architectures (124M-1.5B parameters), confirming universal hourglass intrinsic dimension profiles, smooth curvature structure, and linear gap scaling with slopes 0.87-1.12 (R^2 > 0.985). The margin distribution across models reveals a persistent hard core of boundary-proximal representations invariant to scale, providing a geometric decomposition of perplexity. We discuss implications for architecture design, model compression, decoding strategies, and scaling laws.

13.3 | IV | Mar 27, 2024

Transformers-based architectures for stroke segmentation: A review

Yalda Zafari-Ghadim, Essam A. Rashed, Mohamed Mabrok

Stroke remains a significant global health concern, necessitating precise and efficient diagnostic tools for timely intervention and improved patient outcomes. The emergence of deep learning methodologies has transformed the landscape of medical image analysis. Recently, Transformers, initially designed for natural language processing, have exhibited remarkable capabilities in various computer vision applications, including medical image analysis. This comprehensive review aims to provide an in-depth exploration of the cutting-edge Transformer-based architectures applied in the context of stroke segmentation. It commences with an exploration of stroke pathology, imaging modalities, and the challenges associated with accurate diagnosis and segmentation. Subsequently, the review delves into the fundamental ideas of Transformers, offering detailed insights into their architectural intricacies and the underlying mechanisms that empower them to effectively capture complex spatial information within medical images. The existing literature is systematically categorized and analyzed, discussing various approaches that leverage Transformers for stroke segmentation. A critical assessment is provided, highlighting the strengths and limitations of these methods, including considerations of performance and computational efficiency. Additionally, this review explores potential avenues for future research and development.

10.3 | IV | Mar 25, 2024

Deep models for stroke segmentation: do complex architectures always perform better?

Yalda Zafari-Ghadim, Ahmed Soliman, Yousif Yousif et al.

Stroke segmentation plays a crucial role in the diagnosis and treatment of stroke patients by providing spatial information about affected brain regions and the extent of damage. Segmenting stroke lesions accurately is a challenging task, given that conventional manual techniques are time-consuming and prone to errors. Recently, advanced deep models have been introduced for general medical image segmentation, demonstrating promising results that surpass many state-of-the-art networks when evaluated on specific datasets. With the advent of the vision Transformers, several models have been introduced based on them, while others have aimed to design better modules based on traditional convolutional layers to extract long-range dependencies like Transformers. The question of whether such high-level designs are necessary for all segmentation cases to achieve the best results remains unanswered. In this study, we selected four types of deep models that were recently proposed and evaluated their performance for stroke segmentation: a pure Transformer-based architecture (DAE-Former), two advanced CNN-based models (LKA and DLKA) with attention mechanisms in their design, an advanced hybrid model that incorporates CNNs with Transformers (FCT), and the well-known self-adaptive nnUNet framework with its configuration based on given data. We examined their performance on two publicly available datasets, and found that the nnUNet achieved the best results with the simplest design among all. Revealing the robustness issue of Transformers to such variabilities serves as a potential reason for their weaker performance. Furthermore, nnUNet's success underscores the significant impact of preprocessing and postprocessing techniques in enhancing segmentation results, surpassing the focus solely on architectural designs.

8.6 | IV | Jul 22, 2025

A Hybrid CNN-VSSM model for Multi-View, Multi-Task Mammography Analysis: Robust Diagnosis with Attention-Based Fusion

Yalda Zafari, Roaa Elalfy, Mohamed Mabrok et al.

Early and accurate interpretation of screening mammograms is essential for effective breast cancer detection, yet it remains a complex challenge due to subtle imaging findings and diagnostic ambiguity. Many existing AI approaches fall short by focusing on single-view inputs or single-task outputs, limiting their clinical utility. To address these limitations, we propose a novel multi-view, multi-task hybrid deep learning framework that processes all four standard mammography views and jointly predicts diagnostic labels and BI-RADS scores for each breast. Our architecture integrates a hybrid CNN-VSSM backbone, combining convolutional encoders for rich local feature extraction with Visual State Space Models (VSSMs) to capture global contextual dependencies. To improve robustness and interpretability, we incorporate a gated attention-based fusion module that dynamically weights information across views, effectively handling cases with missing data. We conduct extensive experiments across diagnostic tasks of varying complexity, benchmarking our proposed hybrid models against baseline CNN architectures and VSSM models in both single-task and multi-task learning settings. Across all tasks, the hybrid models consistently outperform the baselines. In the binary BI-RADS 1 vs. 5 classification task, the shared hybrid model achieves an AUC of 0.9967 and an F1 score of 0.9830. For the more challenging ternary classification, it attains an F1 score of 0.7790, while in the five-class BI-RADS task, the best F1 score reaches 0.4904. These results highlight the effectiveness of the proposed hybrid framework and underscore both the potential and limitations of multi-task learning for improving diagnostic performance and enabling clinically meaningful mammography analysis.
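
The view-fusion step described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the gate, so the masked softmax below is an assumption, as are the names `fuse_views`, `gate_scores`, and `present`. In a trained model the gate scores would come from a learned layer rather than being passed in by hand.

```python
import math


def fuse_views(features, gate_scores, present):
    """Fuse per-view feature vectors with a masked softmax over gate scores.

    Illustrative only: missing views (present[i] is False) receive exactly
    zero weight, and the remaining weights are renormalised to sum to one.
    """
    if not any(present):
        raise ValueError("at least one view must be present")

    # Subtract the largest valid score for numerical stability.
    top = max(s for s, p in zip(gate_scores, present) if p)
    exps = [math.exp(s - top) if p else 0.0
            for s, p in zip(gate_scores, present)]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Feature dimension is taken from the first available view.
    dim = len(next(f for f, p in zip(features, present) if p))
    fused = [
        sum(w * f[d] for w, f, p in zip(weights, features, present) if p)
        for d in range(dim)
    ]
    return fused, weights


# Four views with hypothetical 2-D features; the third view is missing.
features = [[1.0, 0.0], [0.0, 1.0], None, [1.0, 1.0]]
gate_scores = [0.0, 0.0, 5.0, 0.0]
present = [True, True, False, True]
fused, weights = fuse_views(features, gate_scores, present)
```

Because the mask is applied before normalisation, a high gate score on a missing view cannot leak into the fused representation, which is one simple way to obtain the missing-data behaviour the abstract mentions.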

7.1 | LG | Jul 5, 2025

MCST-Mamba: Multivariate Mamba-Based Model for Traffic Prediction

Mohamed Hamad, Mohamed Mabrok, Nizar Zorba

Accurate traffic prediction plays a vital role in intelligent transportation systems by enabling efficient routing, congestion mitigation, and proactive traffic control. However, forecasting is challenging due to the combined effects of dynamic road conditions, varying traffic patterns across different locations, and external influences such as weather and accidents. Traffic data often consists of several interrelated measurements - such as speed, flow and occupancy - yet many deep-learning approaches either predict only one of these variables or require a separate model for each. This limits their ability to capture joint patterns across channels. To address this, we introduce the Multi-Channel Spatio-Temporal (MCST) Mamba model, a forecasting framework built on the Mamba selective state-space architecture that natively handles multivariate inputs and simultaneously models all traffic features. The proposed MCST-Mamba model integrates adaptive spatio-temporal embeddings and separates the modeling of temporal sequences and spatial sensor interactions into two dedicated Mamba blocks, improving representation learning. Unlike prior methods that evaluate on a single channel, we assess MCST-Mamba across all traffic features at once, aligning more closely with how congestion arises in practice. Our results show that MCST-Mamba achieves strong predictive performance with a lower parameter count compared to baseline models.

6.2 | CV | Nov 16, 2025

X-VMamba: Explainable Vision Mamba

Mohamed A. Mabrok, Yalda Zafari

State Space Models (SSMs), particularly the Mamba architecture, have recently emerged as powerful alternatives to Transformers for sequence modeling, offering linear computational complexity while achieving competitive performance. Yet, despite their effectiveness, understanding how these Vision SSMs process spatial information remains challenging due to the lack of transparent, attention-like mechanisms. To address this gap, we introduce a controllability-based interpretability framework that quantifies how different parts of the input sequence (tokens or patches) influence the internal state dynamics of SSMs. We propose two complementary formulations: a Jacobian-based method applicable to any SSM architecture that measures influence through the full chain of state propagation, and a Gramian-based approach for diagonal SSMs that achieves superior speed through closed-form analytical solutions. Both methods operate in a single forward pass with linear complexity, requiring no architectural modifications or hyperparameter tuning. We validate our framework through experiments on three diverse medical imaging modalities, demonstrating that SSMs naturally implement hierarchical feature refinement from diffuse low-level textures in early layers to focused, clinically meaningful patterns in deeper layers. Our analysis reveals domain-specific controllability signatures aligned with diagnostic criteria, progressive spatial selectivity across the network hierarchy, and the substantial influence of scanning strategies on attention patterns. Beyond medical imaging, we articulate applications spanning computer vision, natural language processing, and cross-domain tasks. Our framework establishes controllability analysis as a unified, foundational interpretability paradigm for SSMs across all domains. Code and analysis tools will be made available upon publication.
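
To make the idea of a closed-form Gramian for diagonal SSMs concrete, here is a minimal sketch under simplifying assumptions that are not taken from the paper: a time-invariant, single-input recurrence h_t = a * h_{t-1} + b * u_t with per-channel coefficients `a` and `b` (Mamba itself uses input-dependent parameters). The function names are hypothetical. The per-token score is the squared norm of the derivative of the final state with respect to the input at step t, and the sum of these scores equals the trace of the finite-horizon controllability Gramian, which has a geometric-series closed form.

```python
def token_influence(a, b, seq_len):
    """Squared norm of d h_T / d u_t for each step t = 1..T.

    For the diagonal recurrence h_t = a * h_{t-1} + b * u_t, the derivative
    of channel i of h_T with respect to u_t is a_i ** (T - t) * b_i.
    """
    scores = []
    for t in range(1, seq_len + 1):
        k = seq_len - t  # number of state transitions after step t
        scores.append(sum((ai ** k * bi) ** 2 for ai, bi in zip(a, b)))
    return scores


def gramian_trace(a, b, seq_len):
    """Closed-form trace of the finite-horizon controllability Gramian.

    Each diagonal entry is b_i^2 * sum_{k=0}^{T-1} a_i^(2k), a geometric
    series, so no loop over the sequence is needed.
    """
    total = 0.0
    for ai, bi in zip(a, b):
        r = ai * ai
        geo = float(seq_len) if r == 1.0 else (1.0 - r ** seq_len) / (1.0 - r)
        total += bi * bi * geo
    return total


# Hypothetical two-channel example with stable decay rates.
a = [0.9, 0.5]
b = [1.0, 2.0]
scores = token_influence(a, b, seq_len=8)
trace = gramian_trace(a, b, seq_len=8)
```

With decay rates below one in magnitude, later tokens receive larger scores because their contribution to the final state has been attenuated by fewer transitions.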

8.6 | IV | Jul 15, 2025

Flatten Wisely: How Patch Order Shapes Mamba-Powered Vision for MRI Segmentation

Osama Hardan, Omar Elshenhabi, Tamer Khattab et al.

Vision Mamba models promise transformer-level performance at linear computational cost, but their reliance on serializing 2D images into 1D sequences introduces a critical, yet overlooked, design choice: the patch scan order. In medical imaging, where modalities like brain MRI contain strong anatomical priors, this choice is non-trivial. This paper presents the first systematic study of how scan order impacts MRI segmentation. We introduce Multi-Scan 2D (MS2D), a parameter-free module for Mamba-based architectures that facilitates exploring diverse scan paths without additional computational cost. We conduct a large-scale benchmark of 21 scan strategies on three public datasets (BraTS 2020, ISLES 2022, LGG), covering over 70,000 slices. Our analysis shows conclusively that scan order is a statistically significant factor (Friedman test: $\chi^{2}_{20}=43.9$, $p=0.0016$), with performance varying by as much as 27 Dice points. Spatially contiguous paths -- simple horizontal and vertical rasters -- consistently outperform disjointed diagonal scans. We conclude that scan order is a powerful, cost-free hyperparameter, and provide an evidence-based shortlist of optimal paths to maximize the performance of Mamba models in medical imaging.
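
The notion of a patch scan order can be sketched in a few lines. The code below is not the MS2D module; it only illustrates, with hypothetical helper names, how a grid of patches might be serialized along a horizontal raster, a vertical raster, or an anti-diagonal path. The `mean_step` measure (average Manhattan distance between consecutive patches) is an assumed proxy for spatial contiguity, not a metric from the paper.

```python
def raster_horizontal(rows, cols):
    """Row-major order: left to right within a row, rows top to bottom."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def raster_vertical(rows, cols):
    """Column-major order: top to bottom within a column, columns left to right."""
    return [(r, c) for c in range(cols) for r in range(rows)]


def diagonal(rows, cols):
    """Anti-diagonal order: visit cells grouped by r + c, top row first."""
    order = []
    for s in range(rows + cols - 1):
        for r in range(rows):
            c = s - r
            if 0 <= c < cols:
                order.append((r, c))
    return order


def serialize(grid, order):
    """Flatten a 2D grid of patches into a 1D sequence along a scan order."""
    return [grid[r][c] for r, c in order]


def mean_step(order):
    """Average Manhattan distance between consecutive positions in a scan."""
    steps = [abs(r1 - r0) + abs(c1 - c0)
             for (r0, c0), (r1, c1) in zip(order, order[1:])]
    return sum(steps) / len(steps)


# A 4x4 grid whose "patches" are just their row-major indices.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
horizontal_seq = serialize(grid, raster_horizontal(4, 4))
diagonal_seq = serialize(grid, diagonal(4, 4))
```

Every scan visits the same patches exactly once, so the choice of order adds no parameters; on this small grid the raster paths have a shorter average step than the anti-diagonal path under the assumed contiguity measure.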

3.6 | IV | Jun 10, 2024

Neuro-TransUNet: Segmentation of stroke lesion in MRI using transformers

Muhammad Nouman, Mohamed Mabrok, Essam A. Rashed

Accurate segmentation of the stroke lesions using magnetic resonance imaging (MRI) is associated with difficulties due to the complicated anatomy of the brain and the different properties of the lesions. This study introduces the Neuro-TransUNet framework, which synergizes the U-Net's spatial feature extraction with SwinUNETR's global contextual processing ability, further enhanced by advanced feature fusion and segmentation synthesis techniques. The comprehensive data pre-processing pipeline improves the framework's efficiency, which involves resampling, bias correction, and data standardization, enhancing data quality and consistency. Ablation studies confirm the significant impact of the advanced integration of U-Net with SwinUNETR and data pre-processing pipelines on performance and demonstrate the model's effectiveness. The proposed Neuro-TransUNet model, trained with the ATLAS v2.0 training dataset, outperforms existing deep learning algorithms and establishes a new benchmark in stroke lesion segmentation.

4.1 | RO | Jun 12, 2020 | Code

RISCuer: A Reliable Multi-UAV Search and Rescue Testbed

Mohamed Abdelkader, Usman A. Fiaz, Noureddine Toumi et al.

We present the Robotics Intelligent Systems & Control (RISC) Lab multiagent testbed for reliable search and rescue and aerial transport in outdoor environments. The system consists of a team of three multirotor unmanned aerial vehicles (UAVs), which are capable of autonomously searching, picking up, and transporting randomly distributed objects in an outdoor field. The method involves vision-based object detection and localization, passive aerial grasping with our novel design, GPS-based UAV navigation, and safe release of the objects at the drop zone. Our cooperative strategy ensures safe spatial separation between UAVs at all times, and we prevent any conflicts at the drop zone using communication-enabled consensus. All computation is performed onboard each UAV. We describe the complete software and hardware architecture for the system and demonstrate its reliable performance using comprehensive outdoor experiments, and by comparing our results with some recent, similar works.