CV · Apr 11, 2024

ViM-UNet: Vision Mamba for Biomedical Segmentation

arXiv:2404.07705v2 · 13.513 citations · h-index: 4 · Has Code

Originality: Incremental advance

AI Analysis

This work targets the accuracy and efficiency of biomedical image segmentation, offering a domain-specific, incremental improvement over existing methods such as UNet and UNETR.

The authors introduce ViM-UNet, a segmentation architecture based on Vision Mamba. On two microscopy instance segmentation tasks, it performs similarly to or better than UNet, depending on the task, and outperforms UNETR while being more efficient.

CNNs, most notably the UNet, are the default architecture for biomedical segmentation. Transformer-based approaches, such as UNETR, have been proposed to replace them; they benefit from a global field of view but suffer from longer runtimes and higher parameter counts. The recent Vision Mamba architecture offers a compelling alternative to transformers, also providing a global field of view, but at higher efficiency. Here, we introduce ViM-UNet, a novel segmentation architecture based on Vision Mamba, and compare it to UNet and UNETR on two challenging microscopy instance segmentation tasks. We find that it performs similarly to or better than UNet, depending on the task, and outperforms UNETR while being more efficient. Our code is open source and documented at https://github.com/constantinpape/torch-em/blob/main/vimunet.md.

