An Evaluation of ChatGPT-4's Qualitative Spatial Reasoning Capabilities in RCC-8
This work assesses LLM performance in spatial reasoning, a question relevant to researchers and practitioners in AI and related fields. Its contribution is incremental, since it focuses on a single model and a single calculus.
The paper evaluates ChatGPT-4's ability to perform qualitative spatial reasoning tasks using the RCC-8 calculus and finds that the model's capabilities in this area are limited.
Qualitative Spatial Reasoning (QSR) is a well-explored area of Commonsense Reasoning and has multiple applications, ranging from Geographical Information Systems to Robotics and Computer Vision. Recently, many claims have been made for the capabilities of Large Language Models (LLMs). In this paper we investigate the extent to which one particular LLM can perform classical qualitative spatial reasoning tasks in the mereotopological calculus RCC-8.
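For readers unfamiliar with RCC-8, the following minimal sketch lists its eight base relations (DC, EC, PO, EQ, TPP, NTPP, TPPi, NTPPi) and classifies the relation between two regions. It is an illustration only, not the evaluation procedure used in the paper: the restriction to closed discs, the function name `rcc8_discs`, and the tolerance parameter `tol` are all assumptions introduced here.

```python
import math

# The eight jointly exhaustive, pairwise disjoint RCC-8 base relations.
RCC8 = ("DC", "EC", "PO", "EQ", "TPP", "NTPP", "TPPi", "NTPPi")


def rcc8_discs(a, b, tol=1e-9):
    """Return the RCC-8 relation holding between closed discs a and b.

    Each disc is a hypothetical (x, y, r) tuple with r > 0. Using discs
    is an assumption made for illustration; RCC-8 itself is defined over
    arbitrary regions.
    """
    ax, ay, ar = a
    bx, by, br = b
    d = math.hypot(ax - bx, ay - by)  # distance between centres

    if d <= tol and abs(ar - br) <= tol:
        return "EQ"      # identical regions
    if d > ar + br + tol:
        return "DC"      # disconnected
    if abs(d - (ar + br)) <= tol:
        return "EC"      # externally connected (boundaries touch)
    if d + ar < br - tol:
        return "NTPP"    # a strictly inside b, no boundary contact
    if abs(d + ar - br) <= tol:
        return "TPP"     # a inside b, touching b's boundary
    if d + br < ar - tol:
        return "NTPPi"   # b strictly inside a
    if abs(d + br - ar) <= tol:
        return "TPPi"    # b inside a, touching a's boundary
    return "PO"          # partial overlap


if __name__ == "__main__":
    small = (1.0, 0.0, 1.0)
    large = (0.0, 0.0, 2.0)
    print(rcc8_discs(small, large))
    print(rcc8_discs(large, small))
```

Swapping the arguments yields the converse relation, which is why RCC-8 includes the inverse pairs TPP/TPPi and NTPP/NTPPi while the remaining four relations are symmetric.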