CV · Mar 8, 2024

Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery

arXiv:2403.05381v1 · 12 citations · h-index: 15 · 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Originality: Incremental advance
AI Analysis

This paper addresses a practical need of remote sensing users: detecting rare objects in satellite imagery from minimal annotations. The advance is incremental, however, because it adapts existing open-vocabulary detection ideas to a specific domain.

The paper tackles few-shot object detection in satellite imagery with a prototype-based classifier built on pre-trained models, and reports outperforming fully supervised and few-shot methods on the SIMD and DIOR datasets.

The goal of this paper is to perform object detection in satellite imagery with only a few examples, thus enabling users to specify any object class with minimal annotation. To this end, we explore recent methods and ideas from open-vocabulary detection for the remote sensing domain. We develop a few-shot object detector based on a traditional two-stage architecture, where the classification block is replaced by a prototype-based classifier. A large-scale pre-trained model is used to build class-reference embeddings or prototypes, which are compared to region proposal contents for label prediction. In addition, we propose to fine-tune prototypes on available training images to boost performance and learn differences between similar classes, such as aircraft types. We perform extensive evaluations on two remote sensing datasets containing challenging and rare objects. Moreover, we study the performance of both visual and image-text features, namely DINOv2 and CLIP, including two CLIP models specifically tailored for remote sensing applications. Results indicate that visual features are largely superior to vision-language models, as the latter lack the necessary domain-specific vocabulary. Lastly, the developed detector outperforms fully supervised and few-shot methods evaluated on the SIMD and DIOR datasets, despite minimal training parameters.
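The prototype-based classification step described in the abstract can be illustrated with a minimal sketch. It assumes that embeddings for the few annotated examples and for the region proposals have already been extracted by a frozen backbone such as DINOv2, that each prototype is the mean of its class's normalized embeddings, and that proposals are scored by cosine similarity with a softmax temperature. The function names, the `temperature` value, and the mean-pooling choice are illustrative assumptions, not details taken from the paper; background handling and prototype fine-tuning are omitted.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def build_prototypes(support_feats, support_labels, num_classes):
    """Average the normalized support embeddings of each class (assumed pooling)."""
    feats = l2_normalize(np.asarray(support_feats, dtype=float))
    labels = np.asarray(support_labels)
    protos = np.stack([feats[labels == c].mean(axis=0) for c in range(num_classes)])
    return l2_normalize(protos)


def classify_proposals(proposal_feats, prototypes, temperature=0.1):
    """Score each proposal embedding against every prototype.

    Returns the predicted class index and the softmax probabilities.
    The temperature is a hypothetical hyperparameter chosen for illustration.
    """
    sims = l2_normalize(np.asarray(proposal_feats, dtype=float)) @ prototypes.T
    logits = sims / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs


if __name__ == "__main__":
    # Toy 2-D "embeddings": two shots for each of two classes.
    support = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.8]]
    labels = [0, 0, 1, 1]
    prototypes = build_prototypes(support, labels, num_classes=2)

    proposals = [[2.0, 0.2], [0.1, 3.0]]
    predicted, probabilities = classify_proposals(proposals, prototypes)
    print(predicted, probabilities.round(3))
```

In a setup like this, only the prototype matrix would need to be stored or tuned per class, which is consistent with the abstract's remark about minimal training parameters.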

Code Implementations: 1 repo
