Shape-Adapting Gated Experts: Dynamic Expert Routing for Colonoscopic Lesion Segmentation
This work addresses redundant computation and limited adaptability in medical image segmentation for cancer detection, and represents an incremental improvement over existing CNN-Transformer hybrids.
The paper tackles the diversity of cell scale and form in colonoscopic lesion segmentation by proposing Shape-Adapting Gated Experts (SAGE), a dynamic expert routing framework that achieves state-of-the-art Dice scores of 95.57%, 95.16%, and 94.17% on three medical benchmarks.
The substantial diversity in cell scale and form, a consequence of cellular heterogeneity, remains a primary challenge in computer-aided cancer detection on gigapixel Whole Slide Images (WSIs). Existing CNN-Transformer hybrids rely on static computation graphs with fixed routing, which causes redundant computation and limits their adaptability to input variability. We propose Shape-Adapting Gated Experts (SAGE), an input-adaptive framework that enables dynamic expert routing in heterogeneous visual networks by reconfiguring static backbones into dynamically routed expert architectures. SAGE's dual-path design pairs a backbone stream that preserves representation with an expert path that is selectively activated through hierarchical gating. The gating mechanism performs a two-level selection between shared and specialized experts, modulating model logits for Top-K activation. Our Shape-Adapting Hub (SA-Hub) harmonizes structural and semantic representations across the CNN and Transformer modules, bridging these heterogeneous components. Embodied as SAGE-UNet, our model achieves state-of-the-art Dice scores of 95.57%, 95.16%, and 94.17% on three medical benchmarks (EBHI, DigestPath, and GlaS, respectively) and generalizes robustly across domains by adaptively balancing local refinement and global context. SAGE provides a scalable foundation for dynamic expert routing, enabling flexible visual reasoning.
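
To make the routing idea concrete, the following is a minimal sketch of two-level gating with Top-K activation on a dual-path block, written in plain Python. It is an illustration under stated assumptions, not the SAGE implementation: the function names (`hierarchical_top_k`, `dual_path_forward`), the choice to modulate expert logits by adding the group log-probability, the renormalized softmax over the selected experts, and the residual combination with the backbone stream are all hypothetical choices made for this example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_top_k(group_logits, expert_logits, expert_groups, k):
    """Return {expert_index: weight} for the k experts kept by the gate.

    group_logits:  one logit per expert group, e.g. [shared, specialized]
    expert_logits: one logit per expert
    expert_groups: group index of each expert
    (All names and the combination rule are assumptions for illustration.)
    """
    # Level 1: score the groups (shared vs. specialized).
    group_log_probs = [math.log(p) for p in softmax(group_logits)]
    # Level 2: modulate each expert logit by its group's log-probability.
    combined = [
        logit + group_log_probs[g]
        for logit, g in zip(expert_logits, expert_groups)
    ]
    # Top-K activation: keep the k highest-scoring experts.
    top = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)[:k]
    # Renormalize over the selected experts only.
    weights = softmax([combined[i] for i in top])
    return dict(zip(top, weights))


def dual_path_forward(x, backbone, experts, gate_weights):
    """Backbone stream output plus the gated sum of the selected experts."""
    out = list(backbone(x))
    for idx, w in gate_weights.items():
        for j, v in enumerate(experts[idx](x)):
            out[j] += w * v
    return out


# Toy usage: two shared experts (group 0) and two specialized experts (group 1).
experts = [
    lambda x: [2.0 * v for v in x],  # shared expert 0
    lambda x: [v + 1.0 for v in x],  # shared expert 1
    lambda x: [-v for v in x],       # specialized expert 2
    lambda x: [v * v for v in x],    # specialized expert 3
]
weights = hierarchical_top_k(
    group_logits=[0.0, 2.0],
    expert_logits=[1.0, 0.5, 0.8, 0.2],
    expert_groups=[0, 0, 1, 1],
    k=2,
)
y = dual_path_forward([1.0, 2.0], lambda x: x, experts, weights)
```

In this toy setting the group gate favors the specialized group strongly enough that both selected experts come from it, even though a shared expert has the highest raw logit. Only the selected experts are evaluated, which is where an input-adaptive router can avoid the redundant computation of a static graph.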