AIMay 10

Absurd World: A Simple Yet Powerful Method to Absurdify the Real-world for Probing LLM Reasoning Capabilities

Ryan Albright, Golam Md Muktadir, Zarif Ikram, S M Jubaer, Mehrab Hossain, Dianbo Liu

arXiv:2605.0967839.2

AI Analysis

For researchers evaluating LLM reasoning, this framework provides a simple method to test whether models rely on memorized patterns rather than logical reasoning.

The paper proposes Absurd World, a benchmarking framework that alters real-world scenarios into logically coherent but absurd situations to test LLMs' logical reasoning. Experiments show that LLMs often fail on these simple tasks, revealing their reliance on real-world patterns rather than pure logic.

While extremely powerful and versatile at various tasks, the thinking capabilities of large language models (LLMs) are often put under scrutiny as they sometimes fail to solve problems that humans can systematically solve. However, recent literature focuses on breaking LLM reasoning with increasingly complex problems, and whether an LLM is robust in simple logical reasoning remains underexplored. This paper proposes Absurd World, a benchmarking framework, to test LLMs against altered realism, where scenarios are logically coherent, and humans can easily solve the tasks. Absurd World breaks a real-world model into symbols, actions, sequences, and events, which are automatically altered to create absurd worlds where the logic to solve the tasks remains the same. It evaluates a large collection of models with simple and advanced prompting techniques, and proves that it is an effective tool to determine LLMs' ability to think logically, ignoring the patterns learned from the real world. One can use this framework to extensively test an LLM against a real-world problem to verify whether the LLM's reasoning capability is robust against variations of the task.

View on arXiv PDF

Similar