IV | CV | Mar 16, 2022

CapsNet for Medical Image Segmentation

arXiv:2203.08948v1 | 8 citations | h-index: 31
Originality: Synthesis-oriented
AI Analysis

This work is an incremental review: it surveys and discusses CapsNet applications for medical image segmentation, motivated by the potential to reduce data needs and improve model robustness in a domain with high annotation costs and privacy concerns.

The paper addresses the limitations of CNNs in medical image segmentation, such as sensitivity to transformations and reliance on large datasets, by exploring CapsNet architectures for improved robustness and part-whole relationship preservation, though no specific performance numbers are provided.

Convolutional Neural Networks (CNNs) have been successful in computer vision tasks, including medical image segmentation, owing to their ability to automatically extract features from unstructured data. However, CNNs are sensitive to rotation and affine transformations, and their success relies on large-scale labeled datasets that capture varied input conditions. This paradigm poses challenges at scale because annotated data for medical segmentation is expensive to acquire and subject to strict privacy regulations. Furthermore, visual representation learning with CNNs has its own flaws: arguably, the pooling layers in traditional CNNs tend to discard positional information, and CNNs tend to fail on input images that differ in orientation and size. The capsule network (CapsNet) is a recent architecture that achieves better robustness in representation learning by replacing pooling layers with dynamic routing and convolutional strides, and it has shown promising results on popular tasks such as classification, recognition, segmentation, and natural language processing. Unlike CNNs, which produce scalar outputs, CapsNet returns vector outputs, which aim to preserve part-whole relationships. In this work, we first introduce the limitations of CNNs and the fundamentals of CapsNet. We then review recent developments of CapsNet for medical image segmentation. Finally, we discuss various effective network architectures for implementing a CapsNet for both 2D image and 3D volumetric medical image segmentation.
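To make the "vector outputs" and "dynamic routing" ideas concrete, here is a minimal NumPy sketch of the squash nonlinearity and routing-by-agreement as they are commonly formulated for capsule networks. It is an illustration under stated assumptions, not the implementation from the paper: the function names, the tensor shapes, the three routing iterations, and the random input are all hypothetical choices made for this example.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement (sketch).

    u_hat: predictions made by lower-level capsules for each higher-level
           capsule, shape (num_in, num_out, dim_out). Assumed layout.
    Returns the higher-level capsule vectors, shape (num_out, dim_out).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                 # routing logits
    for _ in range(num_iters):
        c = softmax(b, axis=1)                      # coupling coefficients per input capsule
        s = np.sum(c[:, :, None] * u_hat, axis=0)   # weighted sum of predictions
        v = squash(s)                               # vector output of each capsule
        b = b + np.sum(u_hat * v[None, :, :], axis=-1)  # reward predictions that agree with v
    return v

# Hypothetical toy input: 8 lower capsules, 3 higher capsules, 4-D outputs.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(8, 3, 4))
v = dynamic_routing(u_hat)
lengths = np.linalg.norm(v, axis=-1)  # each length can be read as a presence score
```

Each output capsule is a vector whose direction encodes pose-like properties and whose length, bounded below 1 by `squash`, can be read as the likelihood that the entity is present. The routing loop stands in for pooling by sending each lower-level prediction toward the higher-level capsule it agrees with most.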

