GN | AI | Feb 2

The Strategic Foresight of LLMs: Evidence from a Fully Prospective Venture Tournament

arXiv:2602.01684v1 | 1 citation | h-index: 17
Originality: Incremental advance
AI Analysis

This paper addresses the problem of improving prediction accuracy in high-stakes, uncertain settings such as venture funding. It shows that AI has the potential to surpass human experts, though the advance is incremental because it applies existing LLMs to a new domain.

The study tested whether large language models (LLMs) can outperform humans at strategic foresight by predicting fundraising success for live Kickstarter ventures. Frontier LLMs achieved rank correlations with actual outcomes of up to 0.74, well above the 0.04 to 0.45 range achieved by human evaluators.

Can artificial intelligence outperform humans at strategic foresight -- the capacity to form accurate judgments about uncertain, high-stakes outcomes before they unfold? We address this question through a fully prospective prediction tournament using live Kickstarter crowdfunding projects. Thirty U.S.-based technology ventures, launched after the training cutoffs of all models studied, were evaluated while fundraising remained in progress and outcomes were unknown. A diverse suite of frontier and open-weight large language models (LLMs) completed 870 pairwise comparisons, producing complete rankings of predicted fundraising success. We benchmarked these forecasts against 346 experienced managers recruited via Prolific and three MBA-trained investors working under monitored conditions. The results are striking: human evaluators achieved rank correlations with actual outcomes between 0.04 and 0.45, while several frontier LLMs exceeded 0.60, with the best (Gemini 2.5 Pro) reaching 0.74 -- correctly ordering nearly four of every five venture pairs. These differences persist across multiple performance metrics and robustness checks. Neither wisdom-of-the-crowd ensembles nor human-AI hybrid teams outperformed the best standalone model.
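To make the evaluation pipeline concrete, here is a minimal sketch of how pairwise comparisons could be turned into a complete ranking and scored against realized outcomes. It rests on several assumptions not confirmed by the abstract: that the 870 comparisons are the 30 × 29 ordered pairs of ventures (the arithmetic matches, but the design is not stated), that rankings are built by counting pairwise wins, and that the reported rank correlation is Spearman's. The paper may use a different aggregation (for example a Bradley-Terry model) or a different correlation. The function names `rank_from_pairwise`, `spearman`, and `pairwise_accuracy` are hypothetical.

```python
from itertools import permutations


def rank_from_pairwise(items, prefer):
    """Rank items by their number of pairwise wins (assumed aggregation rule).

    `prefer(a, b)` returns whichever of a or b the judge expects to raise more.
    """
    wins = {item: 0 for item in items}
    for a, b in permutations(items, 2):  # every ordered pair with a != b
        wins[prefer(a, b)] += 1
    # Most wins first; ties keep their input order because sorted() is stable.
    return sorted(items, key=lambda item: -wins[item])


def spearman(predicted, actual):
    """Spearman rank correlation between two tie-free rankings of the same items."""
    n = len(predicted)
    actual_pos = {item: i for i, item in enumerate(actual)}
    d2 = sum((i - actual_pos[item]) ** 2 for i, item in enumerate(predicted))
    return 1 - 6 * d2 / (n * (n * n - 1))


def pairwise_accuracy(predicted, actual):
    """Fraction of unordered pairs that both rankings place in the same order."""
    actual_pos = {item: i for i, item in enumerate(actual)}
    n = len(predicted)
    pairs = [(predicted[i], predicted[j]) for i in range(n) for j in range(i + 1, n)]
    return sum(actual_pos[a] < actual_pos[b] for a, b in pairs) / len(pairs)


# Thirty ventures yield 30 * 29 = 870 ordered pairs.
ventures = list(range(30))
print(len(list(permutations(ventures, 2))))  # 870

# Toy judge that always prefers the lower-numbered venture.
ranking = rank_from_pairwise(ventures, lambda a, b: min(a, b))
print(spearman(ranking, ventures))  # 1.0
```

Win counting is chosen here only because it is the simplest way to get a complete ranking from exhaustive pairwise judgments. `pairwise_accuracy` corresponds to the "share of venture pairs ordered correctly" reading of the results, which is related to but not identical to a rank correlation.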
