CV | Jan 16, 2025

Image Segmentation with transformers: An Overview, Challenges and Future

Deepjyoti Chetia, Debasish Dutta, Sanjib Kr Kalita

arXiv:2501.09372v1 | 2 citations | h-index: 3

Originality: Synthesis-oriented

AI Analysis

The paper gives researchers and practitioners in computer vision a survey of the advances and open challenges in transformer-based segmentation. Its contribution is incremental: it summarizes existing work rather than introducing new methods.

This paper reviews the shift from CNN-based to transformer-based models for image segmentation, highlighting how transformers address CNN limitations such as difficulty capturing spatial dependencies and handling objects of varying scales, and outlines future trends such as lightweight architectures.

Image segmentation, a key task in computer vision, has traditionally relied on convolutional neural networks (CNNs), yet these models struggle to capture complex spatial dependencies and contextual information, handle objects of varying scales poorly, and depend on manually crafted architecture components. This paper explores the shortcomings of CNN-based models and the shift towards transformer architectures to overcome those limitations. It reviews state-of-the-art transformer-based segmentation models, addressing segmentation-specific challenges and their solutions. The paper also discusses current challenges in transformer-based segmentation and outlines promising future trends, such as lightweight architectures and enhanced data efficiency. This survey serves as a guide for understanding the impact of transformers in advancing segmentation capabilities and overcoming the limitations of traditional models.
