Self-supervised Pretraining of Cell Segmentation Models
For researchers in computational biology and microscopy image analysis, this work addresses the scarcity of labeled data with a domain-adapted pretraining method that outperforms leading SAM-based models, which are pretrained on natural images, on the LIVECell benchmark.
DINOCell adapts DINOv2 representations to microscopy through continued self-supervised training on unlabeled cell images, achieving a SEG score of 0.784 on LIVECell (a 10.42% improvement over leading SAM-based models) and strong zero-shot performance on three out-of-distribution datasets.
Instance segmentation enables the analysis of spatial and temporal properties of cells in microscopy images by identifying the pixels belonging to each cell. Progress, however, is constrained by the scarcity of high-quality labeled microscopy datasets. Many recent approaches address this challenge by initializing models with segmentation-pretrained weights from large-scale natural-image models such as the Segment Anything Model (SAM). Yet representations learned from natural images often encode objectness and texture priors that are poorly aligned with microscopy data, which degrades performance under domain shift. We propose DINOCell, a self-supervised framework for cell instance segmentation that starts from DINOv2 representations and adapts them to microscopy through continued self-supervised training on unlabeled cell images before supervised fine-tuning. On the LIVECell benchmark, DINOCell achieves a SEG score of 0.784, a 10.42% improvement over leading SAM-based models, and demonstrates strong zero-shot performance on three out-of-distribution microscopy datasets. These results highlight the benefits of domain-adapted self-supervised pretraining for robust cell segmentation.
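For readers unfamiliar with the metric, the following is a minimal sketch of how a SEG score can be computed from two label masks. It assumes the SEG score reported here follows the Cell Tracking Challenge definition: each ground-truth cell is matched to the predicted cell covering more than half of its pixels, the Jaccard index of that pair is taken (0 if no such match exists), and the values are averaged over all ground-truth cells. The function name `seg_score` and the toy masks are illustrative and are not taken from the DINOCell code.

```python
import numpy as np


def seg_score(gt: np.ndarray, pred: np.ndarray) -> float:
    """Mean Jaccard index over ground-truth cells (label 0 is background).

    Assumes the Cell Tracking Challenge SEG definition; the exact evaluation
    protocol used for DINOCell may differ.
    """
    gt_ids = [i for i in np.unique(gt) if i != 0]
    if not gt_ids:
        return 0.0
    scores = []
    for gid in gt_ids:
        ref = gt == gid
        ref_size = ref.sum()
        # Predicted labels that overlap this reference cell, with overlap sizes.
        labels, counts = np.unique(pred[ref], return_counts=True)
        keep = labels != 0
        labels, counts = labels[keep], counts[keep]
        jaccard = 0.0
        if labels.size:
            best = counts.argmax()
            inter = counts[best]
            # A valid match must cover more than half of the reference cell.
            if inter > 0.5 * ref_size:
                union = ref_size + (pred == labels[best]).sum() - inter
                jaccard = inter / union
        scores.append(jaccard)
    return float(np.mean(scores))


# Toy example: cell 1 is mostly recovered, cell 2 is missed entirely.
gt = np.array([[1, 1, 0],
               [1, 1, 0],
               [0, 0, 2]])
pred = np.array([[5, 5, 0],
                 [5, 0, 0],
                 [0, 0, 0]])
print(seg_score(gt, pred))  # 0.375 = mean of 3/4 (cell 1) and 0 (cell 2)
```

Because the average runs over ground-truth cells, missed cells lower the score directly, whereas spurious predictions that overlap no ground-truth cell are not penalized under this definition.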