David Pennock

CY
h-index: 6
5 papers
56 citations
Novelty: 38%
AI Score: 40

5 Papers

HC, Mar 1, 2023
A prototype hybrid prediction market for estimating replicability of published work

Tatiana Chakravorti, Robert Fraleigh, Timothy Fritton et al.

We present a prototype hybrid prediction market and demonstrate the avenue it represents for meaningful human-AI collaboration. We build on prior work proposing artificial prediction markets as a novel machine-learning algorithm. In an artificial prediction market, trained AI agents buy and sell outcomes of future events. Classification decisions can be framed as outcomes of future events, and accordingly, the price of an asset corresponding to a given classification outcome can be taken as a proxy for the confidence of the system in that decision. By embedding human participants in these markets alongside bot traders, we can bring together insights from both. In this paper, we detail pilot studies with prototype hybrid markets for the prediction of replication study outcomes. We highlight challenges and opportunities, share insights from semi-structured interviews with hybrid market participants, and outline a vision for ongoing and future work.

93.1, CY, Apr 19
Human-AI Collaboration for Estimating Scientific Replicability

Tatiana Chakravorti, Robert Fraleigh, Timothy Fritton et al.

Determining whether published scientific findings can successfully be replicated is a long-standing challenge in the empirical sciences. Existing approaches for replicability assessment typically rely either on human judgment, i.e., creative assembly of human experts, or on machine learning models trained on paper content metadata. While both approaches have demonstrated value, each also has important limitations. Human forecasts can be influenced by cognitive biases and narrow exposure to the research literature, while automated assessments often struggle to capture contextual cues and subtle signals of credibility. In this paper, we examine a hybrid approach. Specifically, we introduce a hybrid prediction market in which algorithmic agents trade alongside human participants to jointly estimate the likelihood that a published scientific finding will be corroborated via the outcome of a controlled replication study. Agents are trained on outcomes from hundreds of prior replication studies while human participants contribute domain knowledge through real-time trading. We evaluate this hybrid approach through multiple live experiments involving participants from different academic disciplines and compare its performance to artificial-only and human-only baselines. Our results show that, except for a few cases, hybrid markets match or outperform artificial prediction markets, producing more accurate and reliable replication forecasts.

GT, Jun 2, 2025
Stochastically Dominant Peer Prediction

Yichi Zhang, Shengwei Xu, David Pennock et al.

Eliciting reliable human feedback is essential for many machine learning tasks, such as learning from noisy labels and aligning AI systems with human preferences. Peer prediction mechanisms incentivize truthful reporting without ground truth verification by scoring agents based on correlations with peers. Traditional mechanisms, which ensure that truth-telling maximizes the expected scores in equilibrium, can elicit honest information while assuming agents' utilities are linear functions of their scores. However, in practice, non-linear payment rules are usually preferred, or agents' utilities are inherently non-linear. We propose stochastically dominant truthfulness (SD-truthfulness) as a stronger guarantee: the score distribution of truth-telling stochastically dominates all other strategies, incentivizing truthful reporting for a wide range of monotone utility functions. Our first observation is that no existing peer prediction mechanism naturally satisfies this criterion without strong assumptions. A simple solution -- rounding scores into binary lotteries -- can enforce SD-truthfulness, but often degrades sensitivity, a key property related to fairness and statistical efficiency. We demonstrate how a more careful application of rounding can better preserve sensitivity. Furthermore, we introduce a new enforced agreement (EA) mechanism that is theoretically guaranteed to be SD-truthful in binary-signal settings under mild assumptions, and empirically achieves the highest sensitivity among all known SD-truthful mechanisms.
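The "rounding scores into binary lotteries" idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes scores are bounded in a known interval; the function name `round_to_binary_lottery` and the parameters `lo`, `hi`, and `rng` are hypothetical and not taken from the paper. The agent is paid 1 with probability equal to the normalized score and 0 otherwise, so the expected payment is preserved. Because a binary payment with a higher win probability first-order stochastically dominates one with a lower win probability, a strategy that maximizes the expected score also yields the dominant payment distribution.

```python
import random


def round_to_binary_lottery(score, lo, hi, rng):
    """Round a bounded score into a 0/1 payment (illustrative sketch).

    The payment is 1 with probability (score - lo) / (hi - lo), else 0,
    so its expectation equals the normalized score.
    lo and hi are assumed known bounds on the mechanism's score.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    if not lo <= score <= hi:
        raise ValueError("score must lie in [lo, hi]")
    win_probability = (score - lo) / (hi - lo)
    return 1 if rng.random() < win_probability else 0


def average_payment(score, lo, hi, n_draws, seed):
    """Empirical mean of the lottery payment over n_draws independent draws."""
    rng = random.Random(seed)
    total = sum(round_to_binary_lottery(score, lo, hi, rng) for _ in range(n_draws))
    return total / n_draws
```

The cost of this simple rounding, as the abstract notes, is a loss of sensitivity: the payment becomes a single noisy bit, which is why the paper studies more careful applications of rounding.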

CY, Dec 23, 2021
A Synthetic Prediction Market for Estimating Confidence in Published Work

Sarah Rajtmajer, Christopher Griffin, Jian Wu et al.

Explainably estimating confidence in published scholarly work offers opportunity for faster and more robust scientific progress. We develop a synthetic prediction market to assess the credibility of published claims in the social and behavioral sciences literature. We demonstrate our system and detail our findings using a collection of known replication projects. We suggest that this work lays the foundation for a research agenda that creatively uses AI for peer review.

AI, Jan 31, 2012
Learning Performance of Prediction Markets with Kelly Bettors

Alina Beygelzimer, John Langford, David Pennock

In evaluating prediction markets (and other crowd-prediction mechanisms), investigators have repeatedly observed a so-called "wisdom of crowds" effect, which roughly says that the average of participants performs much better than the average participant. The market price, an average or at least an aggregate of traders' beliefs, offers a better estimate than almost any individual trader's opinion. In this paper, we ask a stronger question: how does the market price compare to the best trader's belief, not just the average trader's? We measure the market's worst-case log regret, a notion common in machine learning theory. To arrive at a meaningful answer, we need to assume something about how traders behave. We suppose that every trader optimizes according to the Kelly criterion, a strategy that provably maximizes the compound growth of wealth over an (infinite) sequence of market interactions. We show several consequences. First, the market prediction is a wealth-weighted average of the individual participants' beliefs. Second, the market learns at the optimal rate, the market price reacts exactly as if updating according to Bayes' Law, and the market prediction has low worst-case log regret to the best individual participant. We simulate a sequence of markets where an underlying true probability exists, showing that the market converges to the true objective frequency as if updating a Beta distribution, as the theory predicts. If agents adopt a fractional Kelly criterion, a common practical variant, we show that agents behave like full-Kelly agents with beliefs weighted between their own and the market's, and that the market price converges to a time-discounted frequency. Our analysis provides a new justification for fractional Kelly betting, a strategy widely used in practice for ad-hoc reasons. Finally, we propose a method for an agent to learn her own optimal Kelly fraction.
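The first two consequences stated in the abstract can be sketched for a single binary event. This is a minimal illustration under stated assumptions, not the authors' code: traders are full-Kelly (log-utility) bettors with fixed beliefs, the contract pays 1 if the event occurs, and the function names `market_price` and `settle` are hypothetical. Under these assumptions, a trader with wealth w and belief b facing price p demands w(b - p) / (p(1 - p)) shares, so market clearing gives the wealth-weighted average price, and each trader's wealth after settlement is rescaled by b / p (event occurs) or (1 - b) / (1 - p) (event does not occur), which has the same form as a Bayesian posterior update on the traders' weights.

```python
def market_price(wealth, beliefs):
    """Clearing price with Kelly bettors: the wealth-weighted average belief."""
    total_wealth = sum(wealth)
    return sum(w * b for w, b in zip(wealth, beliefs)) / total_wealth


def settle(wealth, beliefs, outcome):
    """Return each trader's wealth after the event resolves.

    outcome is True if the event occurred. Each trader's wealth is
    multiplied by the likelihood ratio of their belief to the price,
    mirroring a Bayes update over which trader is "right".
    Assumes the price lies strictly between 0 and 1.
    """
    p = market_price(wealth, beliefs)
    if outcome:
        return [w * b / p for w, b in zip(wealth, beliefs)]
    return [w * (1 - b) / (1 - p) for w, b in zip(wealth, beliefs)]


def run_markets(wealth, beliefs, outcomes):
    """Run a sequence of markets and return (final wealth, price history)."""
    prices = []
    for outcome in outcomes:
        prices.append(market_price(wealth, beliefs))
        wealth = settle(wealth, beliefs, outcome)
    return wealth, prices
```

In this sketch total wealth is conserved at every step, and over a long sequence of outcomes wealth concentrates on the trader whose belief is closest to the empirical frequency, which pulls the price toward that belief.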