LG AI CL IR

Feb 28, 2024

Approaching Human-Level Forecasting with Language Models

Danny Halawi, Fred Zhang, Chen Yueh-Han, Jacob Steinhardt

Berkeley, DeepMind

arXiv:2402.18563v1
33.096 citations
h-index: 13
NIPS

Originality: Incremental advance

AI Analysis

This work addresses the need for scalable and accurate forecasting to inform institutional decision-making, representing a strong specific gain rather than a broad paradigm shift.

The researchers developed a retrieval-augmented language model system for forecasting future events. On a dataset of questions from forecasting platforms, the system on average neared, and in some settings surpassed, the aggregate performance of competitive human forecasters.

Forecasting future events is important for policy and decision making. In this work, we study whether language models (LMs) can forecast at the level of competitive human forecasters. Towards this goal, we develop a retrieval-augmented LM system designed to automatically search for relevant information, generate forecasts, and aggregate predictions. To facilitate our study, we collect a large dataset of questions from competitive forecasting platforms. Under a test set published after the knowledge cut-offs of our LMs, we evaluate the end-to-end performance of our system against the aggregates of human forecasts. On average, the system nears the crowd aggregate of competitive forecasters, and in some settings surpasses it. Our work suggests that using LMs to forecast the future could provide accurate predictions at scale and help to inform institutional decision making.
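The aggregation and evaluation steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how predictions are combined or scored, so the simple mean and the Brier score used here are assumptions chosen for illustration, and the question data and function names are hypothetical.

```python
from statistics import mean


def aggregate(probabilities):
    """Combine several probability forecasts for one binary question.

    Assumption: a plain average; the abstract does not state the method.
    """
    if not probabilities:
        raise ValueError("need at least one forecast")
    return mean(probabilities)


def brier_score(forecast, outcome):
    """Squared error between a probability and the 0/1 outcome (lower is better)."""
    return (forecast - outcome) ** 2


# Hypothetical data: several system forecasts, one crowd aggregate,
# and the resolved outcome for each binary question.
questions = [
    {"system": [0.70, 0.80, 0.75], "crowd": 0.80, "outcome": 1},
    {"system": [0.20, 0.30, 0.10], "crowd": 0.30, "outcome": 0},
]

# Average score of the aggregated system forecast across questions.
system_score = mean(
    brier_score(aggregate(q["system"]), q["outcome"]) for q in questions
)
# Average score of the human crowd aggregate on the same questions.
crowd_score = mean(brier_score(q["crowd"], q["outcome"]) for q in questions)

print(f"system: {system_score:.4f}, crowd: {crowd_score:.4f}")
```

Comparing the two averages on questions that resolved after the model's knowledge cut-off is one way to read the claim that the system "nears the crowd aggregate"; a lower score indicates the more accurate forecaster under this metric.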
