LG · AI · Jan 9

Future-as-Label: Scalable Supervision from Real-World Outcomes

Benjamin Turtel, Paul Wilczewski, Danny Franklin, Kris Skothiem

arXiv:2601.06336v2 · 5 citations · h-index: 2

Originality: Highly original

AI Analysis

The approach provides scalable outcome-based supervision for open-world prediction, addressing the annotation bottleneck in forecasting applications.

The paper obtains supervision for real-world event prediction by using time-resolved outcomes as labels: language models are trained to make probabilistic forecasts, with proper scoring rules as rewards once events resolve. Qwen3-32B trained this way improves Brier score by 27% and halves calibration error relative to its pretrained baseline, and it outperforms the roughly 7x larger Qwen3-235B on both constructed future-event prediction tasks and the Metaculus benchmark.

Time creates free supervision: forecasts about real-world events resolve to verifiable outcomes. The passage of time provides labels that require no annotation. To exploit this structure, we extend reinforcement learning with verifiable rewards to real-world prediction over time. We train language models to make probabilistic forecasts from causally masked information, using proper scoring rules as the reward function once events resolve. Learning is driven entirely by realized outcomes, enabling scalable outcome-based supervision in open-world prediction. On real-world forecasting benchmarks, Qwen3-32B trained using Foresight Learning improves Brier score by 27% and halves calibration error relative to its pretrained baseline, and outperforms Qwen3-235B on both constructed future-event prediction tasks and the Metaculus benchmark despite a 7x parameter disadvantage.
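
To make the reward idea concrete, here is a minimal sketch of how a proper scoring rule can turn a resolved outcome into a scalar reward. It assumes binary events and uses the negative Brier score as the reward; the abstract reports Brier score as a metric but does not state which scoring rule serves as the training reward, so that choice, along with the function names brier_reward, mean_brier_score, and expected_reward, is an illustrative assumption rather than the authors' implementation.

```python
from typing import Sequence


def brier_reward(forecast: float, outcome: int) -> float:
    """Negative squared error for one binary forecast (higher is better).

    Assumption: the reward is the negative Brier score. The source only says
    "proper scoring rules" without naming the one used for training.
    """
    if not 0.0 <= forecast <= 1.0:
        raise ValueError("forecast must be a probability in [0, 1]")
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return -((forecast - outcome) ** 2)


def mean_brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Average Brier score over resolved events (lower is better)."""
    if len(forecasts) != len(outcomes) or len(forecasts) == 0:
        raise ValueError("need equally sized, non-empty inputs")
    total = sum((p - y) ** 2 for p, y in zip(forecasts, outcomes))
    return total / len(forecasts)


def expected_reward(forecast: float, true_prob: float) -> float:
    """Expected reward when the event truly occurs with probability true_prob."""
    return (true_prob * brier_reward(forecast, 1)
            + (1.0 - true_prob) * brier_reward(forecast, 0))


if __name__ == "__main__":
    # A proper scoring rule is maximized in expectation by the true probability.
    grid = [i / 100 for i in range(101)]
    best = max(grid, key=lambda q: expected_reward(q, 0.7))
    print("best forecast on grid:", best)
```

Because the rule is proper, a model cannot raise its expected reward by reporting anything other than its honest probability, which is what makes realized outcomes usable as a training signal without human annotation.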
