29.9CVMay 18
Patch-MoE Mamba: A Patch-Ordered Mixture-of-Experts State Space Architecture for Medical Image SegmentationDiego Adame, Fabian Vazquez, Jose A. Nunez et al.
CNN- and Transformer-based architectures have achieved strong performance in medical image segmentation, but CNNs are limited in modeling long-range dependencies, while Transformers often suffer from quadratic computational and memory complexity. State space models, especially Mamba-based networks, offer an efficient alternative with linear sequence complexity. However, existing Mamba segmentation models still face two limitations: pixel-wise directional scanning can disrupt local 2D spatial structure, and simple summation-based fusion of scan directions cannot adapt well to diverse object sizes, shapes, and boundaries. To address these issues, we propose \textit{Patch-MoE Mamba}, a patch-ordered mixture-of-experts state space architecture for medical image segmentation. It introduces a hierarchical patch-ordered scanning mechanism that preserves local spatial neighborhoods while capturing multi-scale context, and an MoE-based directional fusion module that adaptively combines multiple Mamba scanner outputs using four directional experts, a learnable concatenation expert, and residual directional aggregation. Experiments on five public polyp segmentation benchmarks and the ISIC 2017/2018 skin lesion segmentation datasets demonstrate the effectiveness and generality of Patch-MoE Mamba.
CVDec 8, 2025
Integrating Multi-scale and Multi-filtration Topological Features for Medical Image ClassificationPengfei Gu, Huimin Li, Haoteng Tang et al.
Modern deep neural networks have shown remarkable performance in medical image classification. However, such networks either emphasize pixel-intensity features instead of fundamental anatomical structures (e.g., those encoded by topological invariants), or they capture only simple topological features via single-parameter persistence. In this paper, we propose a new topology-guided classification framework that extracts multi-scale and multi-filtration persistent topological features and integrates them into vision classification backbones. For an input image, we first compute cubical persistence diagrams (PDs) across multiple image resolutions/scales. We then develop a ``vineyard'' algorithm that consolidates these PDs into a single, stable diagram capturing signatures at varying granularities, from global anatomy to subtle local irregularities that may indicate early-stage disease. To further exploit richer topological representations produced by multiple filtrations, we design a cross-attention-based neural network that directly processes the consolidated final PDs. The resulting topological embeddings are fused with feature maps from CNNs or Transformers. By integrating multi-scale and multi-filtration topologies into an end-to-end architecture, our approach enhances the model's capacity to recognize complex anatomical structures. Evaluations on three public datasets show consistent, considerable improvements over strong baselines and state-of-the-art methods, demonstrating the value of our comprehensive topological perspective for robust and interpretable medical image classification.
CVNov 14, 2017
A Multiple Radar Approach for Automatic Target Recognition of Aircraft using Inverse Synthetic Aperture RadarCarlos Pena-Caballero, Elifaleth Cantu, Jesus Rodriguez et al.
Along with the improvement of radar technologies, Automatic Target Recognition (ATR) using Synthetic Aperture Radar (SAR) and Inverse SAR (ISAR) has come to be an active research area. SAR/ISAR are radar techniques to generate a two-dimensional high-resolution image of a target. Unlike other similar experiments using Convolutional Neural Networks (CNN) to solve this problem, we utilize an unusual approach that leads to better performance and faster training times. Our CNN uses complex values generated by a simulation to train the network; additionally, we utilize a multi-radar approach to increase the accuracy of the training and testing processes, thus resulting in higher accuracies than the other papers working on SAR/ISAR ATR. We generated our dataset with 7 different aircraft models with a radar simulator we developed called RadarPixel; it is a Windows GUI program implemented using Matlab and Java programming, the simulator is capable of accurately replicating a real SAR/ISAR configurations. Our objective is to utilize our multi-radar technique and determine the optimal number of radars needed to detect and classify targets.