CV | Mar 10, 2025

Can Generative Geospatial Diffusion Models Excel as Discriminative Geospatial Foundation Models?

arXiv:2503.07890v2 | 14 citations | h-index: 22 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for better pretraining tools in remote sensing by showing that generative diffusion models can rival discriminative foundation models, though it appears incremental as it adapts existing diffusion techniques to a new application.

The paper tackles the problem of whether generative diffusion models can serve as effective discriminative geospatial foundation models for remote sensing tasks, and demonstrates that their SatDiFuser framework outperforms state-of-the-art methods with gains of up to +5.7% mIoU in semantic segmentation and +7.9% F1-score in classification.

Self-supervised learning (SSL) has revolutionized representation learning in Remote Sensing (RS), advancing Geospatial Foundation Models (GFMs) to leverage vast unlabeled satellite imagery for diverse downstream tasks. Currently, GFMs primarily employ objectives like contrastive learning or masked image modeling, owing to their proven success in learning transferable representations. However, generative diffusion models, which demonstrate the potential to capture multi-grained semantics essential for RS tasks during image generation, remain underexplored for discriminative applications. This prompts the question: can generative diffusion models also excel and serve as GFMs with sufficient discriminative power? In this work, we answer this question with SatDiFuser, a framework that transforms a diffusion-based generative geospatial foundation model into a powerful pretraining tool for discriminative RS. By systematically analyzing multi-stage, noise-dependent diffusion features, we develop three fusion strategies to effectively leverage these diverse representations. Extensive experiments on remote sensing benchmarks show that SatDiFuser outperforms state-of-the-art GFMs, achieving gains of up to +5.7% mIoU in semantic segmentation and +7.9% F1-score in classification, demonstrating the capacity of diffusion-based generative foundation models to rival or exceed discriminative GFMs. The source code is available at: https://github.com/yurujaja/SatDiFuser.
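The abstract does not describe the three fusion strategies in detail. As a rough illustration of what fusing multi-stage, noise-dependent features can mean, the sketch below combines feature vectors taken at several noise levels and network blocks using softmax-normalized weights. This is a generic weighted-sum scheme and is not claimed to be one of the paper's strategies. The function names (`softmax`, `fuse_features`), the `(timestep, block)` keys, and the toy values are all hypothetical; in a real model the logits would be learnable parameters and the features would be spatial tensors rather than short lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(feature_maps, logits):
    """Softmax-weighted sum of same-length feature vectors.

    feature_maps: dict mapping a hypothetical (timestep, block) key to a
                  flat list of floats; all lists must share one length.
    logits:       dict with the same keys; stand-ins for weights that a
                  real model would learn on the downstream task.
    """
    keys = sorted(feature_maps)
    weights = softmax([logits[k] for k in keys])
    length = len(feature_maps[keys[0]])
    fused = [0.0] * length
    for key, weight in zip(keys, weights):
        vec = feature_maps[key]
        if len(vec) != length:
            raise ValueError("all feature vectors must share one length")
        for i, value in enumerate(vec):
            fused[i] += weight * value
    return fused, dict(zip(keys, weights))


# Toy stand-ins for features from two noise levels and two network blocks
# (assumed values, not taken from the paper).
features = {
    (100, "block_1"): [1.0, 0.0, 2.0],
    (100, "block_2"): [0.0, 1.0, 2.0],
    (500, "block_1"): [2.0, 2.0, 2.0],
    (500, "block_2"): [1.0, 3.0, 2.0],
}
logits = {key: 0.0 for key in features}  # equal logits give uniform weights

fused, weights = fuse_features(features, logits)
```

With equal logits the weights are uniform, so the fused vector is the plain average of the four inputs. Raising one logit shifts the result toward that timestep and block, which is how a learned weighting could favor the more informative features.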

