The Ideation Bottleneck: Decomposing the Quality Gap Between AI-Generated and Human Economics Research
For the economics research community, it identifies ideation as the primary bottleneck preventing AI-generated papers from matching human quality.
The paper decomposes the quality gap between AI-generated and human economics research into idea quality and execution quality, finding that idea quality accounts for 71% of the gap (Cohen's d=2.23) and only 0.8% of AI papers surpass the median human paper on both dimensions.
Autonomous AI systems can now generate complete economics research papers, but they substantially underperform human-authored publications in head-to-head comparisons. This paper decomposes the quality gap into two independent components: research idea quality and execution quality. Using a two-model ensemble of fine-tuned language models trained on publication decisions (Gong, Li, and Zhou, 2026) to evaluate idea quality and a comprehensive six-dimension rubric assessed by Gemini 3.1 Flash Lite -- the same model family used as the APE tournament judge, ensuring methodological consistency -- to evaluate execution quality, we analyze 953 economics papers -- 912 AI-generated papers from the APE project and 41 human papers published in the American Economic Review and AEJ: Economic Policy. The idea quality gap is large (Cohen's d = 2.23, p < 0.001), with human papers achieving 47.1% mean ensemble exceptional probability versus 16.5% for AI. The execution quality gap is also significant but smaller (d = 0.90, p < 0.001), with human papers scoring 4.38/5.0 versus 3.84. Idea quality accounts for approximately 71% of the overall quality difference, with execution contributing 29%. The largest execution weakness is mechanism analysis depth (d = 1.43); no significant difference is found on robustness. We document that 74% of AI papers employ difference-in-differences, and only 7 AI papers (0.8%) surpass the median human paper on both idea and execution quality simultaneously. The primary bottleneck to competitive AI-generated economics research remains ideation.