CL AI LG | Jul 25, 2025

Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality Aligning

Shengyuan Wang, Jie Feng, Tianhui Liu, Dan Pei, Yong Li

Tsinghua

arXiv:2507.19586v1 | 10.95 citations | h-index: 7 | EMNLP

Originality: Incremental advance

AI Analysis

The paper addresses the reliability of LLMs on geospatial tasks, a domain-specific problem. The advance is incremental because it builds on existing work on general knowledge hallucination.

The paper tackles the problem of geospatial knowledge hallucination in large language models (LLMs) by proposing a benchmark for evaluation and a dynamic factuality aligning method, resulting in a performance improvement of over 29.6% on the benchmark.

Large language models (LLMs) possess extensive world knowledge, including geospatial knowledge, which has been successfully applied to various geospatial tasks such as mobility prediction and social indicator prediction. However, LLMs often generate inaccurate geospatial knowledge, leading to geospatial hallucinations (incorrect or inconsistent representations of geospatial information) that compromise their reliability. While the phenomenon of general knowledge hallucination in LLMs has been widely studied, the systematic evaluation and mitigation of geospatial hallucinations remain largely unexplored. To address this gap, we propose a comprehensive evaluation framework for geospatial hallucinations, leveraging structured geospatial knowledge graphs for controlled assessment. Through extensive evaluation across 20 advanced LLMs, we uncover the hallucinations in their geospatial knowledge. Building on these insights, we introduce a dynamic factuality aligning method based on Kahneman-Tversky Optimization (KTO) to mitigate geospatial hallucinations in LLMs, leading to a performance improvement of over 29.6% on the proposed benchmark. Extensive experimental results demonstrate the effectiveness of our benchmark and learning algorithm in enhancing the trustworthiness of LLMs in geospatial knowledge and reasoning tasks.
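The abstract names Kahneman-Tversky Optimization (KTO) as the basis of the aligning method but does not spell out the objective. As a point of reference only, the standard per-example KTO loss can be sketched as below. The function name `kto_loss`, its parameters, and the default values are illustrative assumptions, and the paper's "dynamic" component is not described in the abstract and is not reproduced here.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def kto_loss(
    policy_logp: float,   # log-probability of the answer under the model being tuned
    ref_logp: float,      # log-probability of the same answer under the frozen reference model
    desirable: bool,      # True for a factually correct answer, False for a hallucinated one
    kl_ref: float = 0.0,  # batch-level KL estimate used as the reference point (assumed given)
    beta: float = 0.1,    # hypothetical default; controls sensitivity to the reward
    lambda_d: float = 1.0,  # weight on desirable examples (hypothetical default)
    lambda_u: float = 1.0,  # weight on undesirable examples (hypothetical default)
) -> float:
    """Standard per-example KTO loss; not the paper's dynamic variant."""
    # Implied reward: log-ratio of policy to reference probability.
    reward = policy_logp - ref_logp
    # The reference point is clamped at zero, as in the usual KTO formulation.
    z0 = max(kl_ref, 0.0)
    if desirable:
        # Loss falls as the policy raises the probability of a correct answer.
        return lambda_d * (1.0 - sigmoid(beta * (reward - z0)))
    # Loss falls as the policy lowers the probability of a hallucinated answer.
    return lambda_u * (1.0 - sigmoid(beta * (z0 - reward)))
```

Unlike pairwise preference objectives, this loss needs only a binary label per answer, which fits a setting where each answer can be checked as correct or incorrect against a knowledge graph.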
