Self-guided Few-shot Semantic Segmentation for Remote Sensing Imagery Based on Large Vision Models
This work addresses the automation of segmentation in remote sensing imagery for applications such as land cover analysis. Its contribution is incremental, since it builds on existing large vision models.
The paper tackles few-shot semantic segmentation for remote sensing imagery. It introduces a framework that automates the process using the Segment Anything Model (SAM) together with a novel automatic prompt learning approach, and it reports better performance on the DLRSD dataset than other few-shot methods.
The Segment Anything Model (SAM) exhibits remarkable versatility and zero-shot learning abilities, owing largely to its extensive training data (SA-1B). Because SAM is category-agnostic, it depends on manual guidance, and we identified unexplored potential for it in few-shot semantic segmentation of remote sensing imagery. This research introduces a structured framework for automating few-shot semantic segmentation. The framework builds on SAM and generates semantically discernible segmentation results more efficiently. Central to our methodology is a novel automatic prompt learning approach that leverages prior guided masks to produce coarse pixel-wise prompts for SAM. Extensive experiments on the DLRSD dataset show that our approach outperforms other available few-shot methods.
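The abstract does not specify how the prior guided masks are computed. The sketch below is one plausible reading, not the authors' implementation. It assumes a construction common in few-shot segmentation: a class prototype obtained by masked average pooling over support features, a cosine-similarity prior over the query features, and a threshold that turns the prior into a coarse pixel-wise prompt. The function names, the `(C, H, W)` feature layout, and the 0.5 threshold are all hypothetical, and the step that feeds the prompt into SAM is left out.

```python
import numpy as np


def masked_average_prototype(support_feat, support_mask):
    """Average the support features over the labelled foreground pixels.

    support_feat: array of shape (C, H, W); support_mask: binary array (H, W).
    Assumption: the prototype is a plain masked mean of the feature vectors.
    """
    weights = support_mask.astype(np.float64)
    total = weights.sum()
    if total == 0:
        raise ValueError("support mask contains no foreground pixels")
    return (support_feat * weights).sum(axis=(1, 2)) / total


def prior_mask(query_feat, prototype, eps=1e-8):
    """Cosine similarity of every query pixel to the prototype, rescaled to [0, 1]."""
    channels, height, width = query_feat.shape
    flat = query_feat.reshape(channels, -1)
    norms = np.linalg.norm(prototype) * np.linalg.norm(flat, axis=0) + eps
    sim = (prototype @ flat / norms).reshape(height, width)
    lo, hi = sim.min(), sim.max()
    return (sim - lo) / (hi - lo + eps)


def coarse_prompt(prior, threshold=0.5):
    """Threshold the prior into a coarse binary pixel-wise prompt (hypothetical cutoff)."""
    return prior >= threshold


# Toy example: the left half of the image carries one feature, the right half another.
features = np.zeros((2, 4, 4))
features[0, :, :2] = 1.0
features[1, :, 2:] = 1.0
support_mask = np.zeros((4, 4), dtype=bool)
support_mask[:, :2] = True

prototype = masked_average_prototype(features, support_mask)
prompt = coarse_prompt(prior_mask(features, prototype))
print(prompt.astype(int))
```

In the toy example the query reuses the support features, so the prompt marks exactly the left half of the image. With real encoder features the prior would be noisier, which is why it is described here only as a coarse prompt.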