CV · Mar 26

GeoHeight-Bench: Towards Height-Aware Multimodal Reasoning in Remote Sensing

arXiv:2603.25565 · 93.5 · h-index: 9
Predicted impact: top 11% in CV · last 90 days · Originality: Highly original
AI Analysis

The paper addresses a critical limitation of remote sensing models in settings such as disaster scenarios, though the contribution is incremental because it builds on existing optical models.

The paper tackles the neglect of the vertical dimension in Large Multimodal Models for Earth Observation by introducing GeoHeight-Bench, a comprehensive evaluation framework for height-aware remote sensing understanding, and demonstrates that integrating height features mitigates the 'vertical blind spot' in existing models.

Current Large Multimodal Models (LMMs) in Earth Observation typically neglect the critical "vertical" dimension, limiting their reasoning capabilities in complex remote sensing geometries and disaster scenarios where physical spatial structures often outweigh planar visual textures. To bridge this gap, we introduce a comprehensive evaluation framework dedicated to height-aware remote sensing understanding. First, to overcome the severe scarcity of annotated data, we develop a scalable, VLM-driven data generation pipeline utilizing systematic prompt engineering and metadata extraction. This pipeline constructs two complementary benchmarks: GeoHeight-Bench for relative height analysis, and a more challenging GeoHeight-Bench+ for holistic, terrain-aware reasoning. Furthermore, to validate the necessity of height perception, we propose GeoHeightChat, the first height-aware remote sensing LMM baseline. Serving as a strong proof of concept, our baseline demonstrates that synergizing visual semantics with implicitly injected height geometric features effectively mitigates the "vertical blind spot", successfully unlocking a new paradigm of interactive height reasoning in existing optical models.
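
To make "relative height analysis" concrete, here is a minimal sketch of how one question-answer pair could be derived from a height raster. It is an illustration only, not the paper's pipeline: the function names, the bounding-box format, the tolerance value, and the toy height map are all assumptions, and the abstract does not describe how GeoHeight-Bench samples regions or phrases questions.

```python
from statistics import mean

# Hypothetical helper: the box format (row0, col0, row1, col1), end-exclusive,
# is an assumption made for this sketch.
def region_mean_height(height_map, box):
    """Return the mean height of the cells inside the box."""
    r0, c0, r1, c1 = box
    return mean(height_map[r][c] for r in range(r0, r1) for c in range(c0, c1))

# Hypothetical generator for a single relative-height question.
# The tolerance (in metres) below which regions count as equal is assumed.
def relative_height_qa(height_map, box_a, box_b, tolerance=0.5):
    h_a = region_mean_height(height_map, box_a)
    h_b = region_mean_height(height_map, box_b)
    if abs(h_a - h_b) <= tolerance:
        answer = "about the same height"
    elif h_a > h_b:
        answer = "Region A"
    else:
        answer = "Region B"
    return {
        "question": "Which region is higher, A or B?",
        "answer": answer,
        "heights_m": (h_a, h_b),
    }

# Toy height map in metres: low ground on the left, a tall structure on the right.
height_map = [
    [2.0, 2.0, 15.0, 15.0],
    [2.0, 2.0, 15.0, 15.0],
    [3.0, 3.0, 3.0, 3.0],
]
qa = relative_height_qa(height_map, (0, 0, 2, 2), (0, 2, 2, 4))
print(qa["question"], "->", qa["answer"])
```

The point of the sketch is that the ground-truth answer comes from height data rather than from the optical image, which is the kind of signal a purely texture-driven model has no direct access to.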

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
