Can AI Refute Economic Theory? Evidence from Beyond the Knowledge Cutoff
For economists and peer reviewers, the paper shows that AI can assist, but not replace, human judgment in detecting theoretical errors.
The paper tests whether AI models can identify errors in published economic theory papers. ChatGPT Pro performed best, but no model located a true error without substantial human guidance.
Can artificial intelligence (AI) refute economic theory? I document experiments in which I asked several AI models (Gemini, Refine, Claude, and ChatGPT) to check the correctness of four published papers in economic theory, each containing an error that I helped identify or correct. ChatGPT Pro performed best, occasionally constructing counterexamples and corrected proofs, while other models fared worse. However, no model located a true error without substantial human guidance, and data contamination complicates interpretation. I argue that a competent human paired with a frontier model can outperform current peer review, but AI cannot yet refute economic theory on its own.