SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model
SAMRS is a valuable resource for remote sensing researchers facing a scarcity of annotated segmentation data, though the contribution is incremental because it builds on existing models and datasets.
The authors tackle the shortage of annotated remote sensing (RS) segmentation data by combining the Segment Anything Model (SAM) with existing RS object detection datasets to create SAMRS, a large-scale dataset with 105,090 images and 1,668,241 instances, several orders of magnitude larger than existing high-resolution RS segmentation datasets.
The success of the Segment Anything Model (SAM) demonstrates the significance of data-centric machine learning. However, due to the difficulty and high cost of annotating Remote Sensing (RS) images, a large amount of valuable RS data remains unlabeled, particularly at the pixel level. In this study, we leverage SAM and existing RS object detection datasets to develop an efficient pipeline for generating a large-scale RS segmentation dataset, dubbed SAMRS. SAMRS contains 105,090 images and 1,668,241 instances in total, surpassing existing high-resolution RS segmentation datasets in size by several orders of magnitude. It provides object category, location, and instance information that can be used for semantic segmentation, instance segmentation, and object detection, either individually or in combination. We also provide a comprehensive analysis of SAMRS from various aspects. Moreover, preliminary experiments highlight the importance of segmentation pre-training with SAMRS to address task discrepancies and to alleviate the limitations imposed by scarce training data during fine-tuning. The code and dataset will be available at https://github.com/ViTAE-Transformer/SAMRS.
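The abstract does not spell out how detection annotations are turned into pixel-level labels, so the following is only a minimal sketch under stated assumptions: each detection box is passed as a prompt to a promptable segmenter, and the resulting per-object masks are merged into a semantic map (category per pixel) and an instance map (object index per pixel). The names `Detection`, `box_fill_segmenter`, and `build_label_maps` are hypothetical, and `box_fill_segmenter` is a trivial stand-in that fills the box; it is not SAM and not the authors' code.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max exclusive
Mask = List[List[bool]]          # height x width binary mask


@dataclass
class Detection:
    box: Box
    category_id: int  # assumed 1-based; 0 is reserved for background


def box_fill_segmenter(height: int, width: int, box: Box) -> Mask:
    """Hypothetical stand-in for a box-prompted segmenter: marks the whole box."""
    x0, y0, x1, y1 = box
    return [[x0 <= x < x1 and y0 <= y < y1 for x in range(width)]
            for y in range(height)]


def build_label_maps(
    height: int,
    width: int,
    detections: Sequence[Detection],
    segment: Callable[[int, int, Box], Mask],
) -> Tuple[List[List[int]], List[List[int]]]:
    """Merge per-object masks into a semantic map and an instance map.

    Where masks overlap, the later detection overwrites the earlier one;
    this tie-breaking rule is an assumption of the sketch.
    """
    semantic = [[0] * width for _ in range(height)]
    instance = [[0] * width for _ in range(height)]
    for instance_id, det in enumerate(detections, start=1):
        mask = segment(height, width, det.box)
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    semantic[y][x] = det.category_id
                    instance[y][x] = instance_id
    return semantic, instance


if __name__ == "__main__":
    dets = [Detection((0, 0, 2, 2), 3), Detection((1, 1, 4, 3), 5)]
    sem, inst = build_label_maps(4, 5, dets, box_fill_segmenter)
    for row in sem:
        print(row)
```

Passing the segmenter as a callable keeps the merging logic independent of any particular model, so a real box-prompted predictor could be swapped in for the stand-in. Together with the original boxes, the two maps cover the three uses the abstract lists: semantic segmentation, instance segmentation, and object detection.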