CV · Nov 27, 2024

CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models

arXiv:2411.18145v416.415 citations · h-index: 2 · Has Code

Originality: Incremental advance

AI Analysis

CHOICE gives remote sensing researchers and practitioners a standardized tool for assessing and improving VLM capabilities, though the advance is incremental because it builds on existing benchmarking efforts in specialized domains.

The authors tackled the lack of a systematic benchmark for evaluating large vision-language models (VLMs) in remote sensing by proposing CHOICE, an extensive benchmark with 10,507 problems across 23 tasks, which revealed critical limitations in 24 evaluated VLMs.

The rapid advancement of Large Vision-Language Models (VLMs), both general-domain models and those specifically tailored for remote sensing, has demonstrated exceptional perception and reasoning capabilities in Earth observation tasks. However, a benchmark for systematically evaluating their capabilities in this domain is still lacking. To bridge this gap, we propose CHOICE, an extensive benchmark designed to objectively evaluate the hierarchical remote sensing capabilities of VLMs. Focusing on 2 primary capability dimensions essential to remote sensing: perception and reasoning, we further categorize 6 secondary dimensions and 23 leaf tasks to ensure a well-rounded assessment coverage. CHOICE guarantees the quality of all 10,507 problems through a rigorous process of data collection from 50 globally distributed cities, question construction and quality control. The newly curated data and the format of multiple-choice questions with definitive answers allow for an objective and straightforward performance assessment. Our evaluation of 3 proprietary and 21 open-source VLMs highlights their critical limitations within this specialized context. We hope that CHOICE will serve as a valuable resource and offer deeper insights into the challenges and potential of VLMs in the field of remote sensing. We will release CHOICE at [this https URL](https://github.com/ShawnAn-WHU/CHOICE).
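Because every problem is a multiple-choice question with one definitive answer, scoring reduces to exact-match accuracy, which can then be rolled up along the benchmark's hierarchy (primary dimension, secondary dimension, leaf task). The following is a minimal sketch of that idea, not the authors' evaluation code. The `Item` record, the `accuracy_by` helper, and the names `secondary_1`, `task_a`, and so on are hypothetical placeholders; only the two primary dimensions, perception and reasoning, come from the abstract.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One scored multiple-choice problem (hypothetical record layout)."""
    primary: str     # "perception" or "reasoning", as named in the abstract
    secondary: str   # placeholder name; the real secondary dimensions are not listed here
    task: str        # placeholder name; the real leaf tasks are not listed here
    answer: str      # ground-truth option letter
    prediction: str  # option letter chosen by the model


def accuracy_by(items, key):
    """Return exact-match accuracy grouped by one level of the hierarchy."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        group = getattr(item, key)
        total[group] += 1
        # Compare option letters case-insensitively, ignoring stray whitespace.
        if item.prediction.strip().upper() == item.answer.strip().upper():
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Toy data, invented purely for illustration.
items = [
    Item("perception", "secondary_1", "task_a", "A", "A"),
    Item("perception", "secondary_1", "task_b", "C", "B"),
    Item("reasoning", "secondary_2", "task_c", "D", "d"),
    Item("reasoning", "secondary_2", "task_c", "B", "A"),
]

primary_scores = accuracy_by(items, "primary")
task_scores = accuracy_by(items, "task")
```

Grouping by a field name keeps the same function usable at all three levels of the hierarchy; a real harness would also need to extract the chosen option letter from free-form model output, which this sketch assumes has already been done.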

