CV, Mar 3, 2025

Primus: Enforcing Attention Usage for 3D Medical Image Segmentation

arXiv:2503.01835v1, 20 citations, h-index: 29
Originality: Incremental advance
AI Analysis

This work addresses the problem of improving segmentation accuracy for 3D medical imaging applications and represents an incremental step towards making Transformers state-of-the-art in this domain.

The paper tackles the limited impact of Transformers on 3D medical image segmentation by introducing Primus, a fully Transformer-based architecture that surpasses current Transformer methods and competes with state-of-the-art convolutional models on multiple public datasets.

Transformers have achieved remarkable success across multiple fields, yet their impact on 3D medical image segmentation remains limited with convolutional networks still dominating major benchmarks. In this work, we a) analyze current Transformer-based segmentation models and identify critical shortcomings, particularly their over-reliance on convolutional blocks. Further, we demonstrate that in some architectures, performance is unaffected by the absence of the Transformer, thereby demonstrating their limited effectiveness. To address these challenges, we move away from hybrid architectures and b) introduce a fully Transformer-based segmentation architecture, termed Primus. Primus leverages high-resolution tokens, combined with advances in positional embeddings and block design, to maximally leverage its Transformer blocks. Through these adaptations Primus surpasses current Transformer-based methods and competes with state-of-the-art convolutional models on multiple public datasets. By doing so, we create the first pure Transformer architecture and take a significant step towards making Transformers state-of-the-art for 3D medical image segmentation.
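
To make the "high-resolution tokens" idea concrete, here is a minimal sketch of how the token count of a 3D volume grows as the patch size shrinks. The volume shape and patch sizes are illustrative assumptions, not values taken from the paper, and the helper names `token_grid` and `token_count` are hypothetical.

```python
from math import prod


def token_grid(volume_shape, patch_size):
    """Per-axis number of tokens for non-overlapping 3D patches."""
    if any(v % p for v, p in zip(volume_shape, patch_size)):
        raise ValueError("volume shape must be divisible by patch size")
    return tuple(v // p for v, p in zip(volume_shape, patch_size))


def token_count(volume_shape, patch_size):
    """Total sequence length seen by the Transformer blocks."""
    return prod(token_grid(volume_shape, patch_size))


# Assumed example: a 128^3 crop with coarse vs. fine patches.
coarse = token_count((128, 128, 128), (16, 16, 16))
fine = token_count((128, 128, 128), (8, 8, 8))
print(coarse, fine)
```

Halving the patch edge length multiplies the sequence length by eight in 3D, which is why higher-resolution tokens give the attention layers much more spatial detail at a correspondingly higher compute cost.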

