IVCVApr 20

A Controlled Benchmark of Visual State-Space Backbones with Domain-Shift and Boundary Analysis for Remote-Sensing Segmentation

arXiv:2604.187216.6h-index: 15
Predicted impact top 96% in IV · last 90 daysOriginality Synthesis-oriented
AI Analysis

For researchers in remote-sensing segmentation, this benchmark clarifies that future improvements should focus on robustness and boundary-aware decoding rather than encoder scaling.

This paper presents a controlled benchmark of visual state-space models (SSMs) for remote-sensing segmentation, isolating encoder effects. Key findings: intra-family scaling yields modest gains, cross-domain generalization is asymmetric, and boundary delineation is the main failure mode under distribution shift.

Visual state-space models (SSMs) are increasingly promoted as efficient alternatives to Vision Transformers, yet their practical advantages remain unclear under fair comparison because existing studies rarely isolate encoder effects from decoder and training choices. We present a strictly controlled benchmark of representative visual SSM families, including VMamba, MambaVision, and Spatial-Mamba, for remote-sensing semantic segmentation, in which only the encoder varies across experiments. Evaluated on LoveDA and ISPRS Potsdam under a unified 4-stage feature interface and a fixed lightweight decoder, the benchmark reveals three main findings, intra-family scaling yields only modest gains, cross-domain generalization is strongly asymmetric, and boundary delineation is the dominant failure mode under distribution shift. Although visual SSMs achieve favorable accuracy-efficiency trade-offs relative to the controlled CNN and Transformer baselines considered here, the results suggest that future improvements are more likely to come from robustness-oriented design and boundary-aware decoding than from encoder scaling alone. By isolating encoder behavior under a unified and reproducible protocol, this study establishes a practical reference benchmark for the design and evaluation of future Mamba-based segmentation backbones

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes