Causes in neuron diagrams, and testing causal reasoning in Large Language Models. A glimpse of the future of philosophy?
This work addresses the challenge of evaluating causal reasoning in AI, a problem of interest to both philosophers and AI researchers. Its contribution is incremental: the proposed test and definition build on existing philosophical frameworks.
The authors develop a test of abstract causal reasoning in AI based on the neuron diagrams used in the philosophy of causation, and find that advanced LLMs (ChatGPT, DeepSeek and Gemini) can correctly identify causes in cases that are debated in the literature. They also propose a new definition of cause in neuron diagrams with wider validity than earlier ones, challenging the view that such a definition is elusive.
We propose a test for abstract causal reasoning in AI, based on scholarship in the philosophy of causation, in particular on the neuron diagrams popularized by D. Lewis. We illustrate the test on advanced Large Language Models (ChatGPT, DeepSeek and Gemini). Remarkably, these chatbots are already capable of correctly identifying causes in cases that are hotly debated in the literature. In order to assess the results of these LLMs and of future dedicated AI, we propose a definition of cause in neuron diagrams with wider validity than those published hitherto, which challenges the widespread view that such a definition is elusive. We submit that these results illustrate how future philosophical research might evolve: as an interplay between human and artificial expertise.
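For readers unfamiliar with neuron diagrams, the following is a minimal sketch of one standard case from the causation literature, early preemption. It is not the test or the definition proposed here. The encoding, the neuron names (C, A, D, B, E) and the helpers `run` and `depends_counterfactually` are all hypothetical, chosen only for illustration. The sketch shows why defining "cause" in such diagrams is hard: C intuitively causes E, yet E does not counterfactually depend on C, because the backup path through B would have fired instead.

```python
# Hypothetical encoding of an early-preemption neuron diagram:
#   C stimulates D, D stimulates E   (the process that actually runs)
#   A stimulates B, B stimulates E   (the backup process)
#   C inhibits B                     (so the backup is cut off)
EXOGENOUS = {"C": True, "A": True}

# Firing rule for each non-input neuron, listed in topological order.
RULES = {
    "D": lambda v: v["C"],
    "B": lambda v: v["A"] and not v["C"],
    "E": lambda v: v["D"] or v["B"],
}


def run(exogenous, interventions=None):
    """Compute which neurons fire, optionally forcing some neurons' states."""
    interventions = interventions or {}
    values = dict(exogenous)
    values.update(interventions)
    for name, rule in RULES.items():
        if name in interventions:
            continue  # a forced neuron ignores its incoming signals
        values[name] = rule(values)
    return values


def depends_counterfactually(effect, candidate):
    """Naive test: does flipping `candidate` change whether `effect` fires?"""
    actual = run(EXOGENOUS)
    flipped = run(EXOGENOUS, {candidate: not actual[candidate]})
    return actual[effect] != flipped[effect]


if __name__ == "__main__":
    print(run(EXOGENOUS))
    print(depends_counterfactually("E", "C"))
    print(depends_counterfactually("E", "D"))
```

In this sketch the naive counterfactual test returns `False` for C even though C initiates the process that makes E fire, which is the kind of verdict a more careful definition of cause has to correct.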