CV | AI | Sep 26, 2025

Geo-R1: Improving Few-Shot Geospatial Referring Expression Understanding with Reinforcement Fine-Tuning

arXiv:2509.21976v2 | 15 citations | h-index: 7 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses accurate object localization in remote sensing when labeled data is limited. It offers improved generalization and interpretability for applications such as environmental monitoring and urban planning, though the advance is incremental because it builds on existing multimodal large language models.

The paper tackles few-shot geospatial referring expression understanding, where models generalize poorly when data is scarce. It proposes Geo-R1, a reinforcement fine-tuning paradigm in which the model generates explicit reasoning chains before localizing the target. Geo-R1 consistently and substantially outperforms supervised fine-tuning baselines on three benchmarks.

Referring expression understanding in remote sensing poses unique challenges, as it requires reasoning over complex object-context relationships. While supervised fine-tuning (SFT) of multimodal large language models achieves strong performance with massive labeled datasets, it struggles in data-scarce scenarios, leading to poor generalization. To address this limitation, we propose Geo-R1, a reasoning-centric reinforcement fine-tuning (RFT) paradigm for few-shot geospatial referring. Geo-R1 requires the model to first generate explicit, interpretable reasoning chains that decompose referring expressions, and then to leverage these rationales to localize target objects. This "reason first, then act" process enables the model to make more effective use of limited annotations, enhances generalization, and provides interpretability. We validate Geo-R1 on three carefully designed few-shot geospatial referring benchmarks, where our model consistently and substantially outperforms SFT baselines. It also demonstrates strong cross-dataset generalization, highlighting its robustness. Code and data will be released at: https://github.com/Geo-R1/geo-r1.
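The abstract does not say how the "reason first, then act" behavior is rewarded during RFT. As an illustration only, here is a minimal sketch of one rule-based reward of the kind often used in R1-style fine-tuning: a fixed bonus for following a reasoning-then-answer template, plus an IoU term for the predicted box. The `<think>`/`<answer>` tags, the `[x1, y1, x2, y2]` box format, the function names, and the 0.5 weighting are all assumptions, not details taken from the paper.

```python
import re

# Hypothetical output template (an assumption, not from the paper):
#   <think>free-form reasoning</think><answer>[x1, y1, x2, y2]</answer>
TEMPLATE = re.compile(
    r"^\s*<think>(?P<think>.+?)</think>\s*"
    r"<answer>\s*\[(?P<box>[^\]]+)\]\s*</answer>\s*$",
    re.DOTALL,
)


def parse_box(response):
    """Return the predicted (x1, y1, x2, y2) box, or None if the template is violated."""
    match = TEMPLATE.match(response)
    if match is None:
        return None
    try:
        coords = [float(v) for v in match.group("box").split(",")]
    except ValueError:
        return None
    if len(coords) != 4:
        return None
    return tuple(coords)


def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = max(0.0, box_a[2] - box_a[0]) * max(0.0, box_a[3] - box_a[1])
    area_b = max(0.0, box_b[2] - box_b[0]) * max(0.0, box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def reward(response, gt_box, format_weight=0.5):
    """Format bonus plus IoU term; zero if the reasoning/answer template is missing.

    format_weight is an illustrative hyperparameter, not a value from the paper.
    """
    box = parse_box(response)
    if box is None:
        return 0.0
    return format_weight + (1.0 - format_weight) * iou(box, gt_box)
```

Under this sketch, a response that skips the reasoning block earns nothing even when its box is correct, which is one simple way to push a policy toward producing a rationale before it localizes.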

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
