CV | IV | Sep 26, 2025

On the Status of Foundation Models for SAR Imagery

arXiv:2509.21722v1 | 1 citation | h-index: 14

Originality: Synthesis-oriented

AI Analysis

This work targets object recognition in SAR imagery for remote sensing applications. It is an incremental advance: it adapts existing self-supervised learning methods to the SAR domain rather than proposing a new method.

The paper investigates the viability of foundation models for Synthetic Aperture Radar (SAR) object recognition. It finds that off-the-shelf visual models (DINOv2, DINOv3, PE-Core) extract poor discriminative features for SAR targets, while self-supervised fine-tuning of publicly available models on SAR data sets a new state of the art, significantly outperforming the best existing SAR-domain model, SARATR-X.

In this work we investigate the viability of foundational AI/ML models for Synthetic Aperture Radar (SAR) object recognition tasks. We are inspired by the tremendous progress being made in the wider community, particularly in the natural image domain where frontier labs are training huge models on web-scale datasets with unprecedented computing budgets. It has become clear that these models, often trained with Self-Supervised Learning (SSL), will transform how we develop AI/ML solutions for object recognition tasks - they can be adapted downstream with very limited labeled data, they are more robust to many forms of distribution shift, and their features are highly transferable out-of-the-box. For these reasons and more, we are motivated to apply this technology to the SAR domain. In our experiments we first run tests with today's most powerful visual foundational models, including DINOv2, DINOv3 and PE-Core and observe their shortcomings at extracting semantically-interesting discriminative SAR target features when used off-the-shelf. We then show that Self-Supervised finetuning of publicly available SSL models with SAR data is a viable path forward by training several AFRL-DINOv2s and setting a new state-of-the-art for SAR foundation models, significantly outperforming today's best SAR-domain model SARATR-X. Our experiments further analyze the performance trade-off of using different backbones with different downstream task-adaptation recipes, and we monitor each model's ability to overcome challenges within the downstream environments (e.g., extended operating conditions and low amounts of labeled data). We hope this work will inform and inspire future SAR foundation model builders, because despite our positive results, we still have a long way to go.
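The abstract does not spell out how "off-the-shelf" features are scored, so the following is only an illustrative sketch of one common way to probe frozen features with few labels: a cosine-similarity k-nearest-neighbour classifier. Everything here is an assumption for illustration, not the authors' protocol: the function names (`knn_probe`, `synthetic_features`) are hypothetical, and the features are synthetic Gaussian clusters standing in for embeddings that a frozen backbone would produce from SAR image chips.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def knn_probe(train_feats, train_labels, test_feats, k=3):
    """Label each test feature by majority vote over its k most similar train features."""
    train = l2_normalize(train_feats)
    test = l2_normalize(test_feats)
    sims = test @ train.T                      # (n_test, n_train) cosine similarities
    topk = np.argsort(-sims, axis=1)[:, :k]    # indices of the k nearest neighbours
    votes = train_labels[topk]                 # (n_test, k) neighbour labels
    n_classes = int(train_labels.max()) + 1
    counts = np.zeros((test.shape[0], n_classes), dtype=int)
    rows = np.arange(test.shape[0])
    for j in range(k):                         # tally one vote per neighbour
        counts[rows, votes[:, j]] += 1
    return counts.argmax(axis=1)

def synthetic_features(n_per_class, sample_seed, n_classes=3, dim=16):
    """Hypothetical stand-in for frozen-backbone embeddings (NOT real SAR data)."""
    # Class centres use a fixed seed so train and test share the same classes.
    centers = np.random.default_rng(0).normal(size=(n_classes, dim)) * 5.0
    rng = np.random.default_rng(sample_seed)
    feats = np.concatenate(
        [centers[c] + rng.normal(size=(n_per_class, dim)) for c in range(n_classes)]
    )
    labels = np.repeat(np.arange(n_classes), n_per_class)
    return feats, labels

if __name__ == "__main__":
    # Few-shot setting: only 5 labelled examples per class.
    train_x, train_y = synthetic_features(5, sample_seed=1)
    test_x, test_y = synthetic_features(50, sample_seed=2)
    preds = knn_probe(train_x, train_y, test_x, k=3)
    print("few-shot k-NN accuracy:", (preds == test_y).mean())
```

A probe like this leaves the backbone untouched, so its accuracy reflects only how separable the extracted features already are. In a real comparison the synthetic arrays would be replaced by embeddings from each candidate backbone.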
