Deterministic Event-Graph Substrates as World Models for Counterfactual Reasoning

arXiv:2605.15967

AI Analysis

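For AI systems that need interpretable, exact counterfactual reasoning, this work provides a domain-agnostic substrate that exceeds the NS-DR symbolic oracle in every CLEVRER question category and the parametric ALOE baseline on descriptive and explanatory questions, though it lags ALOE on predictive and counterfactual questions.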

The paper introduces deterministic event-graph substrates as world models for counterfactual reasoning. On the CLEVRER validation set the substrate exceeds the NS-DR symbolic oracle by up to 20.26 percentage points, and on a new counterfactual benchmark, twin-EventLog, it exceeds Llama-3.1-8B with full context by 18.80 percentage points in joint accuracy.

We study event-graph substrates: a class of world models that represent agent state as an append-only log of typed RDF triples and answer counterfactual queries by forking the log under a structured intervention vocabulary. Substrates are inspectable at the triple level, support exact counterfactuals, and transfer across domains without learned components. We formalize the class, prove a duality between explanatory and counterfactual queries that reduces both to the same causal-ancestor traversal, and evaluate a 1,400-line CLEVRER-DSL interpreter atop a domain-agnostic substrate runtime at full CLEVRER validation scale (n=75,618). The substrate exceeds the NS-DR symbolic oracle on all four per-question categories (by 9.89, 20.26, 17.65, and 0.80 percentage points), and exceeds the parametric ALOE baseline on descriptive and explanatory questions while lagging on predictive and counterfactual questions. We also introduce twin-EventLog, a 500-specification Park-canonical Smallville counterfactual benchmark on which the substrate exceeds Llama-3.1-8B with full context by 18.80 percentage points in joint accuracy.
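The abstract's core idea (an append-only log of typed triples, forked to answer counterfactuals, with explanatory and counterfactual queries sharing one causal-ancestor traversal) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the paper's runtime: the `Event` and `EventLog` classes, the explicit `causes` links, the single "remove an event" intervention, and the example triples are all hypothetical, and the paper's actual intervention vocabulary and formalization are not reproduced.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), RDF-style


@dataclass(frozen=True)
class Event:
    eid: int                               # position in the log
    triple: Triple
    causes: FrozenSet[int] = frozenset()   # ids of direct causal parents (assumed)


@dataclass
class EventLog:
    events: List[Event] = field(default_factory=list)

    def append(self, triple: Triple, causes: Iterable[int] = ()) -> int:
        """Append-only write: existing events are never mutated or deleted."""
        eid = len(self.events)
        parents = frozenset(causes)
        if any(c < 0 or c >= eid for c in parents):
            raise ValueError("causes must refer to earlier events")
        self.events.append(Event(eid, triple, parents))
        return eid

    def ancestors(self, eid: int) -> Set[int]:
        """Causal-ancestor traversal: all events that eid transitively depends on."""
        seen: Set[int] = set()
        stack = list(self.events[eid].causes)
        while stack:
            cur = stack.pop()
            if cur not in seen:
                seen.add(cur)
                stack.extend(self.events[cur].causes)
        return seen

    def explain(self, eid: int) -> List[Triple]:
        """Explanatory query: the triples that causally led to eid, in log order."""
        return [self.events[a].triple for a in sorted(self.ancestors(eid))]

    def fork_without(self, removed: int) -> "EventLog":
        """Counterfactual query (hypothetical 'remove' intervention).

        Builds a new log that drops `removed` and every event that has it as a
        causal ancestor. The original log is left untouched.
        """
        fork = EventLog()
        remap: Dict[int, int] = {}  # old event id -> id in the fork
        for ev in self.events:
            if ev.eid == removed or removed in self.ancestors(ev.eid):
                continue
            remap[ev.eid] = fork.append(ev.triple, [remap[c] for c in ev.causes])
        return fork


# Usage: a toy collision chain with made-up triples.
log = EventLog()
a = log.append(("cube", "movesToward", "sphere"))
b = log.append(("cylinder", "entersScene", "left"))
c = log.append(("cube", "collidesWith", "sphere"), causes=[a])
d = log.append(("sphere", "collidesWith", "cylinder"), causes=[b, c])

why_d = log.explain(d)           # triples of events a, b, c
what_if = log.fork_without(a)    # only the event independent of a survives
surviving = [ev.triple for ev in what_if.events]
```

In this sketch both query types call the same `ancestors` method: `explain` reads the ancestor set of one event, while `fork_without` keeps exactly the events whose ancestor set does not contain the removed event. Recomputing ancestors per event is quadratic in the worst case, which is acceptable for a toy log; a real runtime would presumably cache or index the traversal.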
