NI, AI, LG | Oct 28, 2025

Enabling Near-realtime Remote Sensing via Satellite-Ground Collaboration of Large Vision-Language Models

arXiv:2510.24242v1 | h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses near-realtime remote sensing for applications such as disaster monitoring. It is an incremental advance, building on existing LVLM and collaborative-system concepts.

The paper tackles the challenge of deploying large vision-language models (LVLMs) for near-realtime remote sensing tasks on low Earth orbit satellites by proposing Grace, a satellite-ground collaborative system that reduces average latency by 76-95% compared to state-of-the-art methods while maintaining inference accuracy.

Large vision-language models (LVLMs) have recently demonstrated great potential in remote sensing (RS) tasks (e.g., disaster monitoring) conducted by low Earth orbit (LEO) satellites. However, their deployment in real-world LEO satellite systems remains largely unexplored, hindered by limited onboard computing resources and brief satellite-ground contacts. We propose Grace, a satellite-ground collaborative system designed for near-realtime LVLM inference in RS tasks. Accordingly, we deploy a compact LVLM on satellites for realtime inference and larger ones on ground stations (GSs) to guarantee end-to-end performance. Grace comprises two main components: asynchronous satellite-GS Retrieval-Augmented Generation (RAG) and a task dispatch algorithm. First, we distill the knowledge archive of the GS RAG into the satellite archive, using a tailored adaptive update algorithm during the limited satellite-ground data exchange periods. Second, we propose a confidence-based test algorithm that either processes the task onboard the satellite or offloads it to the GS. Extensive experiments based on real-world satellite orbital data show that Grace reduces the average latency by 76-95% compared to state-of-the-art methods, without compromising inference accuracy.
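
The confidence-based dispatch described in the abstract can be pictured with a minimal sketch. This is not the paper's algorithm: the function names (`dispatch`, `onboard_infer`, `ground_infer`), the `DispatchResult` type, and the 0.8 threshold are all hypothetical, and the abstract does not say how confidence is computed. The sketch only shows the routing decision: answer onboard when the compact model is confident enough, otherwise offload to the ground station.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class DispatchResult:
    answer: str
    confidence: float   # confidence reported by the onboard model
    handled_by: str     # "satellite" or "ground"


def dispatch(
    task: str,
    onboard_infer: Callable[[str], Tuple[str, float]],
    ground_infer: Callable[[str], str],
    threshold: float = 0.8,  # hypothetical cut-off, not a value from the paper
) -> DispatchResult:
    """Route a task: keep it onboard if confident, else offload to the GS."""
    answer, confidence = onboard_infer(task)
    if confidence >= threshold:
        return DispatchResult(answer, confidence, "satellite")
    # Low confidence: hand the task to the larger ground-station model.
    return DispatchResult(ground_infer(task), confidence, "ground")


if __name__ == "__main__":
    # Stand-in models for illustration only.
    def compact_model(task: str) -> Tuple[str, float]:
        return ("flooded area detected", 0.55)

    def large_model(task: str) -> str:
        return "flooding along the river, two bridges affected"

    result = dispatch("Describe damage in this tile", compact_model, large_model)
    print(result.handled_by, "->", result.answer)
```

In a real deployment the offload branch would also have to wait for the next satellite-ground contact window, which is where most of the latency trade-off arises; the sketch leaves that out.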

